PREFACE TO THE FOURTH EDITION


THE main change from the third edition is that the chapter on quantum 
electrodynamics has been rewritten. The quantum electrodynamics 
given in the third edition describes the motion of individual charged 
particles moving through the electromagnetic field, in close analogy 
with classical electrodynamics. It is a form of theory in which the 
number of charged particles is conserved and it cannot be generalized 
to allow of variation of the number of charged particles. 

In present-day high-energy physics the creation and annihilation 
of charged particles is a frequent occurrence. A quantum electro- 
dynamics which demands conservation of the number of charged 
particles is therefore out of touch with physical reality. So I have 
replaced it by a quantum electrodynamics which includes creation and 
annihilation of electron-positron pairs. This involves abandoning any 
close analogy with classical electron theory, but provides a closer 
description of nature. It seems that the classical concept of an electron 
is no longer a useful model in physics, except possibly for elementary 
theories that are restricted to low-energy phenomena. 

P. A. M. D.
ST. JOHN’S COLLEGE, CAMBRIDGE
11 May 1957 


NOTE TO THE REVISION OF THE 
FOURTH EDITION 
THE opportunity has been taken of revising parts of Chapter XII
(‘Quantum electrodynamics’) and of adding two new sections on
interpretation and applications.

P. A. M. D.
ST. JOHN’S COLLEGE, CAMBRIDGE
26 May 1967


FROM THE 
PREFACE TO THE FIRST EDITION 


THE methods of progress in theoretical physics have undergone a 
vast change during the present century. The classical tradition 
has been to consider the world to be an association of observable 
objects (particles, fluids, fields, etc.) moving about according to 
definite laws of force, so that one could form a mental picture in 
space and time of the whole scheme. This led to a physics whose aim 
was to make assumptions about the mechanism and forces connecting 
these observable objects, to account for their behaviour in the 
simplest possible way. It has become increasingly evident in recent 
times, however, that nature works on a different plan. Her funda- 
mental laws do not govern the world as it appears in our mental 
picture in any very direct way, but instead they control a substra- 
tum of which we cannot form a mental picture without intro- 
ducing irrelevancies. The formulation of these laws requires the use 
of the mathematics of transformations. The important things in 
the world appear as the invariants (or more generally the nearly 
invariants, or quantities with simple transformation properties) 
of these transformations. The things we are immediately aware of 
are the relations of these nearly invariants to a certain frame of 
reference, usually one chosen so as to introduce special simplifying 
features which are unimportant from the point of view of general
theory. 

The growth of the use of transformation theory, as applied first to 
relativity and later to the quantum theory, is the essence of the new 
method in theoretical physics. Further progress lies in the direction 
of making our equations invariant under wider and still wider trans- 
formations. This state of affairs is very satisfactory from a philo- 
sophical point of view, as implying an increasing recognition of the 
part played by the observer in himself introducing the regularities 
that appear in his observations, and a lack of arbitrariness in the ways 
of nature, but it makes things less easy for the learner of physics. 
The new theories, if one looks apart from their mathematical setting, 
are built up from physical concepts which cannot be explained in 
terms of things previously known to the student, which cannot even 
be explained adequately in words at all. Like the fundamental con- 
cepts (e.g. proximity, identity) which every one must learn on his 
arrival into the world, the newer concepts of physics can be mastered
only by long familiarity with their properties and uses. 

From the mathematical side the approach to the new theories 
presents no difficulties, as the mathematics required (at any rate that 
which is required for the development of physics up to the present) 
is not essentially different from what has been current for a consider- 
able time. Mathematics is the tool specially suited for dealing with 
abstract concepts of any kind and there is no limit to its power in this 
field. For this reason a book on the new physics, if not purely descrip- 
tive of experimental work, must be essentially mathematical. All the 
same the mathematics is only a tool and one should learn to hold the 
physical ideas in one’s mind without reference to the mathematical 
form. In this book I have tried to keep the physics to the forefront, 
by beginning with an entirely physical chapter and in the later work 
examining the physical meaning underlying the formalism wherever 
possible. The amount of theoretical ground one has to cover before 
being able to solve problems of real practical value is rather large, but 
this circumstance is an inevitable consequence of the fundamental 
part played by transformation theory and is likely to become more 
pronounced in the theoretical physics of the future. 

With regard to the mathematical form in which the theory can be 
presented, an author must decide at the outset between two methods. 
There is the symbolic method, which deals directly in an abstract way 
with the quantities of fundamental importance (the invariants, etc., 
of the transformations) and there is the method of coordinates or 
representations, which deals with sets of numbers corresponding to 
these quantities. The second of these has usually been used for the 
presentation of quantum mechanics (in fact it has been used practi- 
cally exclusively with the exception of Weyl’s book Gruppentheorie 
und Quantenmechanik). It is known under one or other of the two 
names ‘Wave Mechanics’ and ‘Matrix Mechanics’ according to which 
physical things receive emphasis in the treatment, the states of a 
system or its dynamical variables. It has the advantage that the kind 
of mathematics required is more familiar to the average student, and 
also it is the historical method. 

The symbolic method, however, seems to go more deeply into the 
nature of things. It enables one to express the physical laws in a neat
and concise way, and will probably be increasingly used in the future 
as it becomes better understood and its own special mathematics gets 
developed. For this reason I have chosen the symbolic method,
introducing the representatives later merely as an aid to practical 
calculation. This has necessitated a complete break from the histori- 
cal line of development, but this break is an advantage through 
enabling the approach to the new ideas to be made as direct as 
possible. 


P. A. M. D.
ST. JOHN’S COLLEGE, CAMBRIDGE
29 May 1930
1
THE PRINCIPLE OF SUPERPOSITION 


1. The need for a quantum theory 

CLASSICAL mechanics has been developed continuously from the time
of Newton and applied to an ever-widening range of dynamical 
systems, including the electromagnetic field in interaction with 
matter. The underlying ideas and the laws governing their applica- 
tion form a simple and elegant scheme, which one would be inclined 
to think could not be seriously modified without having all its 
attractive features spoilt. Nevertheless it has been found possible to 
set up a new scheme, called quantum mechanics, which is more 
suitable for the description of phenomena on the atomic scale and 
which is in some respects more elegant and satisfying than the 
classical scheme. This possibility is due to the changes which the 
new scheme involves being of a very profound character and not 
clashing with the features of the classical theory that make it so 
attractive, as a result of which all these features can be incorporated 
in the new scheme. 

The necessity for a departure from classical mechanics is clearly 
shown by experimental results. In the first place the forces known 
in classical electrodynamics are inadequate for the explanation of the 
remarkable stability of atoms and molecules, which is necessary in 
order that materials may have any definite physical and chemical 
properties at all. The introduction of new hypothetical forces will not 
save the situation, since there exist general principles of classical 
mechanics, holding for all kinds of forces, leading to results in direct 
disagreement with observation. For example, if an atomic system has 
its equilibrium disturbed in any way and is then left alone, it will be set
in oscillation and the oscillations will get impressed on the surround- 
ing electromagnetic field, so that their frequencies may be observed 
with a spectroscope. Now whatever the laws of force governing the 
equilibrium, one would expect to be able to include the various fre- 
quencies in a scheme comprising certain fundamental frequencies and 
their harmonics. This is not observed to be the case. Instead, there 
is observed a new and unexpected connexion between the frequencies,
called Ritz’s Combination Law of Spectroscopy, according to which all 
the frequencies can be expressed as differences between certain terms, 
the number of terms being much less than the number of frequencies.
This law is quite unintelligible from the classical standpoint. 
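
As a minimal illustration of the Combination Law, not part of the original text, the following sketch assumes hydrogen-like terms T_n = R/n², with R the Rydberg frequency. The function names `terms` and `frequencies` are hypothetical. It shows how a small number of terms generates a larger number of frequencies, each a difference between two terms.

```python
from itertools import combinations

# Assumed example: hydrogen-like terms T_n = R / n**2, expressed in Hz.
R = 3.2898e15  # Rydberg frequency (approximate value), Hz

def terms(n_max):
    """Return the spectral terms T_1 .. T_n_max, largest first."""
    return [R / n**2 for n in range(1, n_max + 1)]

def frequencies(term_list):
    """Each frequency is the difference between two of the terms."""
    return [hi - lo for hi, lo in combinations(term_list, 2)]

t = terms(6)
f = frequencies(t)
print(len(t), "terms give", len(f), "frequencies")
```

With N terms there are N(N-1)/2 differences, so the number of frequencies soon greatly exceeds the number of terms, and sums such as (T_1 - T_2) + (T_2 - T_3) = T_1 - T_3 are again frequencies of the scheme.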

One might try to get over the difficulty without departing from 
classical mechanics by assuming each of the spectroscopically ob- 
served frequencies to be a fundamental frequency with its own degree 
of freedom, the laws of force being such that the harmonic vibrations 
do not occur. Such a theory will not do, however, even apart from 
the fact that it would give no explanation of the Combination Law, 
since it would immediately bring one into conflict with the experi- 
mental evidence on specific heats. Classical statistical mechanics 
enables one to establish a general connexion between the total number 
of degrees of freedom of an assembly of vibrating systems and its 
specific heat. If one assumes all the spectroscopic frequencies of an 
atom to correspond to different degrees of freedom, one would get a 
specific heat for any kind of matter very much greater than the 
observed value. In fact the observed specific heats at ordinary 
temperatures are given fairly well by a theory that takes into account 
merely the motion of each atom as a whole and assigns no internal 
motion to it at all. 

This leads us to a new clash between classical mechanics and the 
results of experiment. There must certainly be some internal motion 
in an atom to account for its spectrum, but the internal degrees of 
freedom, for some classically inexplicable reason, do not contribute 
to the specific heat. A similar clash is found in connexion with the 
energy of oscillation of the electromagnetic field in a vacuum. Classical
mechanics requires the specific heat corresponding to this energy to 
be infinite, but it is observed to be quite finite. A general conclusion 
from experimental results is that oscillations of high frequency do 
not contribute their classical quota to the specific heat. 
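
The conclusion about high-frequency oscillations can be made quantitative with the standard Planck-Einstein expression for a single quantized oscillator, which the text does not derive at this point and which is brought in here only as an assumed illustration. In units of Boltzmann's constant k, with x = hν/kT, the classical quota is 1 for every frequency, while the quantized oscillator contributes x²eˣ/(eˣ − 1)². The function name is hypothetical.

```python
import math

def oscillator_heat_capacity(x):
    """Heat capacity of one quantized oscillator, in units of k.

    x = h*nu / (k*T). The classical (equipartition) value is 1 for all x.
    """
    if x < 1e-6:
        return 1.0  # classical limit; avoids 0/0 in floating point
    e = math.exp(-x)
    # x^2 e^x / (e^x - 1)^2, rewritten to avoid overflow at large x
    return x * x * e / (1.0 - e) ** 2

for x in (0.1, 1.0, 5.0, 20.0):
    print(f"h*nu/kT = {x:5.1f}   C/k = {oscillator_heat_capacity(x):.3e}")
```

For small x the value is close to the classical quota, and for large x it falls off roughly as x²e⁻ˣ, so that oscillations of sufficiently high frequency contribute practically nothing.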

As another illustration of the failure of classical mechanics we may 
consider the behaviour of light. We have, on the one hand, the 
phenomena of interference and diffraction, which can be explained 
only on the basis of a wave theory; on the other, phenomena such as 
photo-electric emission and scattering by free electrons, which show 
that light is composed of small particles. These particles, which 
are called photons, have each a definite energy and momentum, de- 
pending on the frequency of the light, and appear to have just as 
real an existence as electrons, or any other particles known in physics. 
A fraction of a photon is never observed. 


Experiments have shown that this anomalous behaviour is not 
peculiar to light, but is quite general. All material particles have 
wave properties, which can be exhibited under suitable conditions. 
We have here a very striking and general example of the breakdown 
of classical mechanics—not merely an inaccuracy in its laws of motion, 
but an inadequacy of its concepts to supply us with a description of 
atomic events. 

The necessity to depart from classical ideas when one wishes to 
account for the ultimate structure of matter may be seen, not only 
from experimentally established facts, but also from general philo- 
sophical grounds. In a classical explanation of the constitution of 
matter, one would assume it to be made up of a large number of small 
constituent parts and one would postulate laws for the behaviour of 
these parts, from which the laws of the matter in bulk could be de- 
duced. This would not complete the explanation, however, since the 
question of the structure and stability of the constituent parts is left 
untouched. To go into this question, it becomes necessary to postu- 
late that each constituent part is itself made up of smaller parts, in 
terms of which its behaviour is to be explained. There is clearly no 
end to this procedure, so that one can never arrive at the ultimate 
structure of matter on these lines. So long as big and small are merely 
relative concepts, it is no help to explain the big in terms of the small. 
It is therefore necessary to modify classical ideas in such a way as to 
give an absolute meaning to size. 

At this stage it becomes important to remember that science is 
concerned only with observable things and that we can observe an 
object only by letting it interact with some outside influence. An act 
of observation is thus necessarily accompanied by some disturbance 
of the object observed. We may define an object to be big when the 
disturbance accompanying our observation of it may be neglected, 
and small when the disturbance cannot be neglected. This definition 
is in close agreement with the common meanings of big and small. 

It is usually assumed that, by being careful, we may cut down the 
disturbance accompanying our observation to any desired extent. 
The concepts of big and small are then purely relative and refer to the 
gentleness of our means of observation as well as to the object being 
described. In order to give an absolute meaning to size, such as is 
required for any theory of the ultimate structure of matter, we have 
to assume that there is a limit to the fineness of our powers of observation 
and the smallness of the accompanying disturbance, a limit which is
inherent in the nature of things and can never be surpassed by improved
technique or increased skill on the part of the observer. If the object under
observation is such that the unavoidable limiting disturbance is negli- 
gible, then the object is big in the absolute sense and we may apply 
classical mechanics to it. If, on the other hand, the limiting dis- 
turbance is not negligible, then the object is small in the absolute 
sense and we require a new theory for dealing with it. 

A consequence of the preceding discussion is that we must revise 
our ideas of causality. Causality applies only to a system which is 
left undisturbed. If a system is small, we cannot observe it without 
producing a serious disturbance and hence we cannot expect to find 
any causal connexion between the results of our observations. 
Causality will still be assumed to apply to undisturbed systems and 
the equations which will be set up to describe an undisturbed system 
will be differential equations expressing a causal connexion between 
conditions at one time and conditions at a later time. These equations 
will be in close correspondence with the equations of classical 
mechanics, but they will be connected only indirectly with the results 
of observations. There is an unavoidable indeterminacy in the calcu- 
lation of observational results, the theory enabling us to calculate in 
general only the probability of our obtaining a particular result when 
we make an observation. 


2. The polarization of photons 

The discussion in the preceding section about the limit to the 
gentleness with which observations can be made and the consequent 
indeterminacy in the results of those observations does not provide 
any quantitative basis for the building up of quantum mechanics. 
For this purpose a new set of accurate laws of nature is required. 
One of the most fundamental and most drastic of these is the Principle 
of Superposition of States. We shall lead up to a general formulation 
of this principle through a consideration of some special cases, taking 
first the example provided by the polarization of light. 

It is known experimentally that when plane-polarized light is used 
for ejecting photo-electrons, there is a preferential direction for the 
electron emission. Thus the polarization properties of light are closely 
connected with its corpuscular properties and one must ascribe a 
polarization to the photons. One must consider, for instance, a beam 
of light plane-polarized in a certain direction as consisting of photons
each of which is plane-polarized in that direction and a beam of 
circularly polarized light as consisting of photons each circularly 
polarized. Every photon is in a certain state of polarization, as we 
shall say. The problem we must now consider is how to fit in these 
ideas with the known facts about the resolution of light into polarized 
components and the recombination of these components. 

Let us take a definite case. Suppose we have a beam of light passing 
through a crystal of tourmaline, which has the property of letting 
through only light plane-polarized perpendicular to its optic axis. 
Classical electrodynamics tells us what will happen for any given 
polarization of the incident beam. If this beam is polarized per- 
pendicular to the optic axis, it will all go through the crystal; if 
parallel to the axis, none of it will go through; while if polarized at 
an angle α to the axis, a fraction sin²α will go through. How are we
to understand these results on a photon basis? 

A beam that is plane-polarized in a certain direction is to be 
pictured as made up of photons each plane-polarized in that 
direction. This picture leads to no difficulty in the cases when our 
incident beam is polarized perpendicular or parallel to the optic axis. 
We merely have to suppose that each photon polarized perpendicular 
to the axis passes unhindered and unchanged through the crystal, 
while each photon polarized parallel to the axis is stopped and absorbed.
A difficulty arises, however, in the case of the obliquely
polarized incident beam. Each of the incident photons is then 
obliquely polarized and it is not clear what will happen to such a 
photon when it reaches the tourmaline. 

A question about what will happen to a particular photon under 
certain conditions is not really very precise. To make it precise one 
must imagine some experiment performed having a bearing on the 
question and inquire what will be the result of the experiment. Only 
questions about the results of experiments have a real significance 
and it is only such questions that theoretical physics has to consider. 

In our present example the obvious experiment is to use an incident 
beam consisting of only a single photon and to observe what appears 
on the back side of the crystal. According to quantum mechanics 
the result of this experiment will be that sometimes one will find a 
whole photon, of energy equal to the energy of the incident photon, 
on the back side and other times one will find nothing. When one 
finds a whole photon, it will be polarized perpendicular to the optic
axis. One will never find only a part of a photon on the back side.
If one repeats the experiment a large number of times, one will find
the photon on the back side in a fraction sin²α of the total number
of times. Thus we may say that the photon has a probability sin²α
of passing through the tourmaline and appearing on the back side
polarized perpendicular to the axis and a probability cos²α of being
absorbed. These values for the probabilities lead to the correct
classical results for an incident beam containing a large number of
photons.
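
The passage from single-photon probabilities to the classical fraction can be checked numerically. The sketch below is an illustration only: the names `passes` and `transmitted_fraction` are hypothetical, and a pseudo-random number generator stands in for the undetermined outcome of each observation. Each photon either appears whole on the back side, with probability sin²α, or does not appear at all.

```python
import math
import random

def passes(alpha, rng):
    """One photon polarized at angle alpha (radians) to the optic axis.

    Returns True if the whole photon appears on the back side,
    which happens with probability sin(alpha)**2.
    """
    return rng.random() < math.sin(alpha) ** 2

def transmitted_fraction(alpha, n_photons, seed=0):
    """Fraction of n_photons single-photon trials that are transmitted."""
    rng = random.Random(seed)
    return sum(passes(alpha, rng) for _ in range(n_photons)) / n_photons

alpha = math.radians(30)
print("simulated fraction:", transmitted_fraction(alpha, 100_000))
print("classical fraction:", math.sin(alpha) ** 2)
```

No individual trial ever yields a fraction of a photon; the classical value sin²α emerges only as the relative frequency over a large number of trials.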

In this way we preserve the individuality of the photon in all 
cases. We are able to do this, however, only because we abandon the 
determinacy of the classical theory. The result of an experiment is 
not determined, as it would be according to classical ideas, by the 
conditions under the control of the experimenter. The most that can 
be predicted is a set of possible results, with a probability of occur- 
rence for each. 

The foregoing discussion about the result of an experiment with a 
single obliquely polarized photon incident on a crystal of tourmaline 
answers all that can legitimately be asked about what happens to an 
obliquely polarized photon when it reaches the tourmaline. Questions 
about what decides whether the photon is to go through or not and 
how it changes its direction of polarization when it does go through 
cannot be investigated by experiment and should be regarded as 
outside the domain of science. Nevertheless some further description 
is necessary in order to correlate the results of this experiment with 
the results of other experiments that might be performed with 
photons and to fit them all into a general scheme. Such further 
description should be regarded, not as an attempt to answer questions
outside the domain of science, but as an aid to the formulation of 
rules for expressing concisely the results of large numbers of experi- 
ments. 

The further description provided by quantum mechanics runs as 
follows. It is supposed that a photon polarized obliquely to the optic 
axis may be regarded as being partly in the state of polarization 
parallel to the axis and partly in the state of polarization perpendicular
to the axis. The state of oblique polarization may be considered
as the result of some kind of superposition process applied to
the two states of parallel and perpendicular polarization. This implies 
a certain special kind of relationship between the various states of
polarization, a relationship similar to that between polarized beams in 
classical optics, but which is now to be applied, not to beams, but to 
the states of polarization of one particular photon. This relationship 
allows any state of polarization to be resolved into, or expressed as a 
superposition of, any two mutually perpendicular states of polari- 
zation. 

When we make the photon meet a tourmaline crystal, we are sub- 
jecting it to an observation. We are observing whether it is polarized 
parallel or perpendicular to the optic axis. The effect of making this
observation is to force the photon entirely into the state of parallel 
or entirely into the state of perpendicular polarization. It has to 
make a sudden jump from being partly in each of these two states to 
being entirely in one or other of them. Which of the two states it will 
jump into cannot be predicted, but is governed only by probability 
laws. If it jumps into the parallel state it gets absorbed and if it 
jumps into the perpendicular state it passes through the crystal and 
appears on the other side preserving this state of polarization. 
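
The jump described above has a simple consequence that can be sketched in code: a photon that has passed one crystal is entirely in the perpendicular state, so a second crystal with the same optic axis lets it through with certainty. The second crystal is an assumed extension of the experiment, and the function name `observe` is hypothetical.

```python
import math
import random

def observe(alpha, rng):
    """Observation by a tourmaline crystal.

    alpha is the photon's polarization angle (radians) to the optic axis.
    Returns the angle after the crystal, or None if the photon is absorbed.
    """
    p_through = math.sin(alpha) ** 2
    if rng.random() < p_through:
        return math.pi / 2   # jumped into the perpendicular state
    return None              # jumped into the parallel state and was absorbed

rng = random.Random(0)
first = [observe(math.radians(45), rng) for _ in range(1000)]
survivors = [a for a in first if a is not None]
second = [observe(a, rng) for a in survivors]
print(len(survivors), "photons passed the first crystal;",
      sum(a is not None for a in second), "of them passed the second")
```

The outcome at the first crystal is governed only by probability, but once the jump has occurred the result of repeating the same observation is certain.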


3. Interference of photons 

In this section we shall deal with another example of superposition. 
We shall again take photons, but shall be concerned with their posi- 
tion in space and their momentum instead of their polarization. If 
we are given a beam of roughly monochromatic light, then we know 
something about the location and momentum of the associated 
photons. We know that each of them is located somewhere in the 
region of space through which the beam is passing and has a momen- 
tum in the direction of the beam of magnitude given in terms of the 
frequency of the beam by Einstein’s photo-electric law—momentum 
equals frequency multiplied by a universal constant. When we have 
such information about the location and momentum of a photon we 
shall say that it is in a definite translational state. 
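
To give an idea of the magnitudes involved, the sketch below evaluates the energy hν and the momentum hν/c of a single photon. The universal constant multiplying the frequency in the momentum law is taken here to be h/c. The wavelength of 500 nm is an assumed example and the function names are hypothetical.

```python
H = 6.62607015e-34   # Planck constant, J s (exact SI value)
C = 299_792_458.0    # speed of light, m/s (exact SI value)

def photon_energy(frequency_hz):
    """Energy of one photon, E = h * nu, in joules."""
    return H * frequency_hz

def photon_momentum(frequency_hz):
    """Momentum of one photon, p = h * nu / c, in kg m/s."""
    return H * frequency_hz / C

nu = C / 500e-9   # assumed example: visible light of wavelength 500 nm
print("energy   (J):     ", photon_energy(nu))
print("momentum (kg m/s):", photon_momentum(nu))
```

Both quantities are fixed by the frequency alone, which is why a beam of roughly monochromatic light determines the momentum of its photons.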

We shall discuss the description which quantum mechanics pro- 
vides of the interference of photons. Let us take a definite experi- 
ment demonstrating interference. Suppose we have a beam of light 
which is passed through some kind of interferometer, so that it gets 
split up into two components and the two components are subse- 
quently made to interfere. We may, as in the preceding section, take 
an incident beam consisting of only a single photon and inquire what 
will happen to it as it goes through the apparatus. This will present
to us the difficulty of the conflict between the wave and corpuscular 
theories of light in an acute form. 

Corresponding to the description that we had in the case of the 
polarization, we must now describe the photon as going partly into 
each of the two components into which the incident beam is split. 
The photon is then, as we may say, in a translational state given by the
superposition of the two translational states associated with the two
components. We are thus led to a generalization of the term ‘translational
state’ applied to a photon. For a photon to be in a definite
translational state it need not be associated with one single beam of
light, but may be associated with two or more beams of light which
are the components into which one original beam has been split.† In
the accurate mathematical theory each translational state is associated 
with one of the wave functions of ordinary wave optics, which wave 
function may describe either a single beam or two or more beams 
into which one original beam has been split. Translational states are 
thus superposable in a similar way to wave functions. 

Let us consider now what happens when we determine the energy 
in one of the components. The result of such a determination must 
be either the whole photon or nothing at all. Thus the photon must 
change suddenly from being partly in one beam and partly in the 
other to being entirely in one of the beams. This sudden change is 
due to the disturbance in the translational state of the photon which 
the observation necessarily makes. It is impossible to predict in which 
of the two beams the photon will be found. Only the probability of 
either result can be calculated from the previous distribution of the 
photon over the two beams. 

One could carry out the energy measurement without destroying the 
component beam by, for example, reflecting the beam from a movable
mirror and observing the recoil. Our description of the photon allows 
us to infer that, after such an energy measurement, it would not be 
possible to bring about any interference effects between the two com- 
ponents. So long as the photon is partly in one beam and partly in 
the other, interference can occur when the two beams are superposed, 
but this possibility disappears when the photon is forced entirely into
one of the beams by an observation. The other beam then no longer
enters into the description of the photon, so that it counts as being
entirely in the one beam in the ordinary way for any experiment that
may subsequently be performed on it.

† The circumstance that the superposition idea requires us to generalize our
original meaning of translational states, but that no corresponding generalization was
needed for the states of polarization of the preceding section, is an accidental one
with no underlying theoretical significance.

On these lines quantum mechanics is able to effect a reconciliation 
of the wave and corpuscular properties of light. The essential point 
is the association of each of the translational states of a photon with 
one of the wave functions of ordinary wave optics. The nature of this 
association cannot be pictured on a basis of classical mechanics, but 
is something entirely new. It would be quite wrong to picture the 
photon and its associated wave as interacting in the way in which 
particles and waves can interact in classical mechanics. The associa- 
tion can be interpreted only statistically, the wave function giving 
us information about the probability of our finding the photon in any 
particular place when we make an observation of where it is. 

Some time before the discovery of quantum mechanics people 
realized that the connexion between light waves and photons must 
be of a statistical character. What they did not clearly realize, how- 
ever, was that the wave function gives information about the proba- 
bility of one photon being in a particular place and not the probable 
number of photons in that place. The importance of the distinction 
can be made clear in the following way. Suppose we have a beam 
of light consisting of a large number of photons split up into two com- 
ponents of equal intensity. On the assumption that the intensity of 
a beam is connected with the probable number of photons in it, we 
should have half the total number of photons going into each com- 
ponent. If the two components are now made to interfere, we should 
require a photon in one component to be able to interfere with one in 
the other. Sometimes these two photons would have to annihilate one 
another and other times they would have to produce four photons. 
This would contradict the conservation of energy. The new theory, 
which connects the wave function with probabilities for one photon, 
gets over the difficulty by making each photon go partly into each of 
the two components. Each photon then interferes only with itself. 
Interference between two different photons never occurs. 
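
The statement that each photon interferes only with itself can be illustrated with a small calculation. The sketch assumes an idealized interferometer that splits the beam into two components of equal intensity and recombines them symmetrically with a relative phase; these details, and the function names, are assumptions made for the illustration and are not taken from the text.

```python
import cmath
import math

def detection_probability(phase):
    """Probability of finding the photon at one output when the two
    components are recombined with relative phase `phase` (radians)."""
    a1 = 1 / math.sqrt(2)                      # amplitude in component 1
    a2 = cmath.exp(1j * phase) / math.sqrt(2)  # amplitude in component 2
    combined = (a1 + a2) / math.sqrt(2)        # superposition at the output
    return abs(combined) ** 2

def probability_after_which_beam_measurement():
    """Photon already forced entirely into one beam (probability 1/2 each);
    either beam then reaches this output with probability 1/2."""
    return 0.5 * 0.5 + 0.5 * 0.5

for phase in (0.0, math.pi / 2, math.pi):
    print(f"phase {phase:.3f}: superposed {detection_probability(phase):.3f}, "
          f"after observation {probability_after_which_beam_measurement():.3f}")
```

While the photon is partly in each component the probability varies with the phase between 0 and 1, and the total number of photons is conserved. Once an observation has forced it into one beam the phase dependence disappears.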

The association of particles with waves discussed above is not 
restricted to the case of light, but is, according to modern theory,
of universal applicability. All kinds of particles are associated with 
waves in this way and conversely all wave motion is associated with 
particles. Thus all particles can be made to exhibit interference
effects and all wave motion has its energy in the form of quanta. The 
reason why these general phenomena are not more obvious is on 
account of a law of proportionality between the mass or energy of the 
particles and the frequency of the waves, the coefficient being such 
that for waves of familiar frequencies the associated quanta are 
extremely small, while for particles even as light as electrons the 
associated wave frequency is so high that it is not easy to demonstrate 
interference. 
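
The orders of magnitude behind this remark can be estimated as follows. The sketch takes the law of proportionality to be E = hν, applies it to a radio wave of 1 MHz (an assumed example of a familiar frequency), and to the rest energy mc² of an electron. The function names are hypothetical.

```python
H = 6.62607015e-34          # Planck constant, J s (exact SI value)
C = 299_792_458.0           # speed of light, m/s (exact SI value)
M_ELECTRON = 9.1093837e-31  # electron mass, kg (approximate value)

def quantum_energy(frequency_hz):
    """Energy of one quantum of a wave of the given frequency, in joules."""
    return H * frequency_hz

def rest_mass_frequency(mass_kg):
    """Wave frequency associated with the rest energy m c^2, in Hz."""
    return mass_kg * C ** 2 / H

print("quantum of a 1 MHz wave (J):   ", quantum_energy(1e6))
print("electron wave frequency (Hz):  ", rest_mass_frequency(M_ELECTRON))
```

The quantum of the radio wave is of the order of 10⁻²⁷ joule, far below anything ordinarily detectable, while the frequency associated with an electron is of the order of 10²⁰ cycles per second.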


4. Superposition and indeterminacy 

The reader may possibly feel dissatisfied with the attempt in the 
two preceding sections to fit in the existence of photons with the 
classical theory of light. He may argue that a very strange idea has 
been introduced—the possibility of a photon being partly in each of 
two states of polarization, or partly in each of two separate beams— 
but even with the help of this strange idea no satisfying picture of 
the fundamental single-photon processes has been given. He may say 
further that this strange idea did not provide any information about 
experimental results for the experiments discussed, beyond what 
could have been obtained from an elementary consideration of 
photons being guided in some vague way by waves. What, then, is 
the use of the strange idea? 

In answer to the first criticism it may be remarked that the main 
object of physical science is not the provision of pictures, but is the 
formulation of laws governing phenomena and the application of 
these laws to the discovery of new phenomena. If a picture exists, 
so much the better; but whether a picture exists or not is a matter 
of only secondary importance. In the case of atomic phenomena 
no picture can be expected to exist in the usual sense of the word 
‘picture’, by which is meant a model functioning essentially on 
classical lines. One may, however, extend the meaning of the word 
‘picture’ to include any way of looking at the fundamental laws which 
makes their self-consistency obvious. With this extension, one may 
gradually acquire a picture of atomic phenomena by becoming 
familiar with the laws of the quantum theory. 

With regard to the second criticism, it may be remarked that for 
many simple experiments with light, an elementary theory of waves 
and photons connected in a vague statistical way would be adequate 
to account for the results. In the case of such experiments quantum 
mechanics has no further information to give. In the great majority 
of experiments, however, the conditions are too complex for an 
elementary theory of this kind to be applicable and some more 
elaborate scheme, such as is provided by quantum mechanics, is then 
needed. The method of description that quantum mechanics gives 
in the more complex cases is applicable also to the simple cases and 
although it is then not really necessary for accounting for the experi- 
mental results, its study in these simple cases is perhaps a suitable 
introduction to its study in the general case. 

There remains an overall criticism that one may make to the whole 
scheme, namely, that in departing from the determinacy of the 
classical theory a great complication is introduced into the descrip- 
tion of Nature, which is a highly undesirable feature. This complica- 
tion is undeniable, but it is offset by a great simplification, provided 
by the general principle of superposition of states, which we shall now 
go on to consider. But first it is necessary to make precise the impor- 
tant concept of a ‘state’ of a general atomic system. 

Let us take any atomic system, composed of particles or bodies 
with specified properties (mass, moment of inertia, etc.) interacting 
according to specified laws of force. There will be various possible 
motions of the particles or bodies consistent with the laws of force. 
Each such motion is called a state of the system. According to 
classical ideas one could specify a state by giving numerical values 
to all the coordinates and velocities of the various component parts 
of the system at some instant of time, the whole motion being then 
completely determined. Now the argument of pp. 3 and 4 shows that 
we cannot observe a small system with that amount of detail which 
classical theory supposes. The limitation in the power of observation 
puts a limitation on the number of data that can be assigned to a 
state. Thus a state of an atomic system must be specified by fewer 
or more indefinite data than a complete set of numerical values 
for all the coordinates and velocities at some instant of time. In the 
case when the system is just a single photon, a state would be 
completely specified by a given translational state in the sense of § 3 
together with a given state of polarization in the sense of § 2. 

A state of a system may be defined as an undisturbed motion that 
is restricted by as many conditions or data as are theoretically 
possible without mutual interference or contradiction. In practice 
the conditions could be imposed by a suitable preparation of the 
system, consisting perhaps in passing it through various kinds of 
sorting apparatus, such as slits and polarimeters, the system being 
left undisturbed after the preparation. The word ‘state’ may be 
used to mean either the state at one particular time (after the 
preparation), or the state throughout the whole of time after the 
preparation. To distinguish these two meanings, the latter will be 
called a ‘state of motion’ when there is liable to be ambiguity. 

The general principle of superposition of quantum mechanics 
applies to the states, with either of the above meanings, of any one 
dynamical system. It requires us to assume that between these 
states there exist peculiar relationships such that whenever the 
system is definitely in one state we can consider it as being partly 
in each of two or more other states. The original state must be 
regarded as the result of a kind of superposition of the two or more 
new states, in a way that cannot be conceived on classical ideas. Any 
state may be considered as the result of a superposition of two or 
more other states, and indeed in an infinite number of ways. Con- 
versely any two or more states may be superposed to give a new 
state. The procedure of expressing a state as the result of super- 
position of a number of other states is a mathematical procedure 
that is always permissible, independent of any reference to physical 
conditions, like the procedure of resolving a wave into Fourier com- 
ponents. Whether it is useful in any particular case, though, depends 
on the special physical conditions of the problem under consideration. 

In the two preceding sections examples were given of the super- 
position principle applied to a system consisting of a single photon. 
§ 2 dealt with states differing only with regard to the polarization and 
§ 3 with states differing only with regard to the motion of the photon 
as a whole. 

The nature of the relationships which the superposition principle 
requires to exist between the states of any system is of a kind that 
cannot be explained in terms of familiar physical concepts. One 
cannot in the classical sense picture a system being partly in each of 
two states and see the equivalence of this to the system being com- 
pletely in some other state. There is an entirely new idea involved, 
to which one must get accustomed and in terms of which one must 
proceed to build up an exact mathematical theory, without having 
any detailed classical picture. 

When a state is formed by the superposition of two other states, 
it will have properties that are in some vague way intermediate 
between those of the two original states and that approach more or 
less closely to those of either of them according to the greater or less 
‘weight’ attached to this state in the superposition process. The new 
state is completely defined by the two original states when their 
relative weights in the superposition process are known, together 
with a certain phase difference, the exact meaning of weights and 
phases being provided in the general case by the mathematical theory. 
In the case of the polarization of a photon their meaning is that pro- 
vided by classical optics, so that, for example, when two perpendicu- 
larly plane polarized states are superposed with equal weights, the 
new state may be circularly polarized in either direction, or linearly 
polarized at an angle π/4, or else elliptically polarized, according to 
the phase difference. 
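
The polarization example can be sketched numerically. The following is a minimal illustration, assuming the classical description in which the two perpendicular plane-polarized states carry complex amplitudes of equal modulus; the function name `classify_polarization` and the tolerance `tol` are introduced here for illustration only.

```python
import cmath
import math

def classify_polarization(delta, tol=1e-9):
    """Classify the equal-weight superposition of two perpendicular
    plane-polarized states, given their phase difference delta (radians)."""
    cx = 1 / math.sqrt(2)                       # amplitude of the first state
    cy = cmath.exp(1j * delta) / math.sqrt(2)   # amplitude of the second state
    rel = cy / cx                               # only the relative phase matters
    if abs(rel.imag) < tol:
        return "linear"       # in phase or in antiphase: plane polarized at pi/4
    if abs(rel.real) < tol:
        return "circular"     # quarter-period phase difference, either direction
    return "elliptical"       # any other phase difference
```

For instance, `classify_polarization(math.pi / 2)` falls in the circular case and `classify_polarization(0)` in the linear one.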

The non-classical nature of the superposition process is brought 
out clearly if we consider the superposition of two states, A and B, 
such that there exists an observation which, when made on the 
system in state A, is certain to lead to one particular result, a say, and 
when made on the system in state B is certain to lead to some different 
result, b say. What will be the result of the observation when made 
on the system in the superposed state? The answer is that the result 
will be sometimes a and sometimes b, according to a probability law 
depending on the relative weights of A and B in the superposition 
process. It will never be different from both a and b. The inter- 
mediate character of the state formed by superposition thus expresses 
itself through the probability of a particular result for an observation 
being intermediate between the corresponding probabilities for the original 
states,† not through the result itself being intermediate between the 
corresponding results for the original states. 

† The probability of a particular result for the state formed by superposition is not 
always intermediate between those for the original states in the general case when 
those for the original states are not zero or unity, so there are restrictions on the 
‘intermediateness’ of a state formed by superposition. 

In this way we see that such a drastic departure from ordinary 
ideas as the assumption of superposition relationships between the 
states is possible only on account of the recognition of the importance 
of the disturbance accompanying an observation and of the 
consequent indeterminacy in the result of the observation. When an 
observation is made on any atomic system that is in a given state, 
in general the result will not be determinate, i.e., if the experiment 
is repeated several times under identical conditions several different 
results may be obtained. It is a law of nature, though, that if the 
experiment is repeated a large number of times, each particular result 
will be obtained in a definite fraction of the total number of times, so 
that there is a definite probability of its being obtained. This 
probability is what the theory sets out to calculate. Only in special cases 
when the probability for some result is unity is the result of the 
experiment determinate. 
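
The probability law for the superposition of A and B is not derived at this stage. As a rough sketch only, one may assume the rule established later in the theory: if the two states are weighted by complex coefficients, each result occurs with probability proportional to the squared modulus of its coefficient. The function name and the use of squared moduli are assumptions of this illustration, not statements made in the present section.

```python
def outcome_probabilities(c_a, c_b):
    """Probabilities of results a and b for a superposition of states A and B
    with complex coefficients c_a and c_b (assumed squared-modulus rule)."""
    w_a = abs(c_a) ** 2
    w_b = abs(c_b) ** 2
    total = w_a + w_b
    if total == 0:
        raise ValueError("the coefficients must not both vanish")
    # Only the relative weights matter, so the sum is divided out.
    return w_a / total, w_b / total
```

The result is always one of a or b; only the frequencies with which they occur depend on the weights, and the phases of the coefficients do not enter these two probabilities.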

The assumption of superposition relationships between the states 
leads to a mathematical theory in which the equations that define 
a state are linear in the unknowns. In consequence of this, people 
have tried to establish analogies with systems in classical mechanics, 
such as vibrating strings or membranes, which are governed by linear 
equations and for which, therefore, a superposition principle holds. 
Such analogies have led to the name ‘Wave Mechanics’ being some- 
times given to quantum mechanics. It is important to remember, 
however, that the superposition that occurs in quantum mechanics is 
of an essentially different nature from any occurring in the classical 
theory, as is shown by the fact that the quantum superposition prin- 
ciple demands indeterminacy in the results of observations in order 
to be capable of a sensible physical interpretation. The analogies are 
thus liable to be misleading. 


5. Mathematical formulation of the principle 

A profound change has taken place during the present century in 
the opinions physicists have held on the mathematical foundations 
of their subject. Previously they supposed that the principles of 
Newtonian mechanics would provide the basis for the description 
of the whole of physical phenomena and that all the theoretical 
physicist had to do was suitably to develop and apply these prin- 
ciples. With the recognition that there is no logical reason why 
Newtonian and other classical principles should be valid outside the 
domains in which they have been experimentally verified has come 
the realization that departures from these principles are indeed 
necessary. Such departures find their expression through the intro- 
duction of new mathematical formalisms, new schemes of axioms 
and rules of manipulation, into the methods of theoretical physics. 

Quantum mechanics provides a good example of the new ideas. It 
requires the states of a dynamical system and the dynamical variables 
to be interconnected in quite strange ways that are unintelligible 
from the classical standpoint. The states and dynamical variables 
have to be represented by mathematical quantities of different 
natures from those ordinarily used in physics. The new scheme 
becomes a precise physical theory when all the axioms and rules of 
manipulation governing the mathematical quantities are specified 
and when in addition certain laws are laid down connecting physical 
facts with the mathematical formalism, so that from any given 
physical conditions equations between the mathematical quantities 
may be inferred and vice versa. In an application of the theory one 
would be given certain physical information, which one would pro- 
ceed to express by equations between the mathematical quantities. 
One would then deduce new equations with the help of the axioms 
and rules of manipulation and would conclude by interpreting these 
new equations as physical conditions. The justification for the whole 
scheme depends, apart from internal consistency, on the agreement 
of the final results with experiment. 

We shall begin to set up the scheme by dealing with the mathe- 
matical relations between the states of a dynamical system at one 
instant of time, which relations will come from the mathematical 
formulation of the principle of superposition. The superposition pro- 
cess is a kind of additive process and implies that states can in some 
way be added to give new states. The states must therefore be con- 
nected with mathematical quantities of a kind which can be added 
together to give other quantities of the same kind. The most obvious 
of such quantities are vectors. Ordinary vectors, existing in a space 
of a finite number of dimensions, are not sufficiently general for 
most of the dynamical systems in quantum mechanics. We have to 
make a generalization to vectors in a space of an infinite number of 
dimensions, and the mathematical treatment becomes complicated 
by questions of convergence. For the present, however, we shall deal 
merely with some general properties of the vectors, properties which 
can be deduced on the basis of a simple scheme of axioms, and 
questions of convergence and related topics will not be gone into 
until the need arises. 

It is desirable to have a special name for describing the vectors 
which are connected with the states of a system in quantum mecha- 
nics, whether they are in a space of a finite or an infinite number of 
dimensions. We shall call them ket vectors, or simply kets, and denote 
a general one of them by a special symbol | >. If we want to specify 
a particular one of them by a label, A say, we insert it in the middle, 
thus |A>. The suitability of this notation will become clear as the 
scheme is developed. 

Ket vectors may be multiplied by complex numbers and may be 
added together to give other ket vectors, e.g. from two ket vectors 
|A> and |B> we can form 

$$c_1|A\rangle + c_2|B\rangle = |R\rangle, \tag{1}$$

say, where c₁ and c₂ are any two complex numbers. We may also 
perform more general linear processes with them, such as adding an 
infinite sequence of them, and if we have a ket vector |x>, depending 
on and labelled by a parameter x which can take on all values in a 
certain range, we may integrate it with respect to x, to get another 
ket vector 

$$\int |x\rangle\, dx = |Q\rangle$$

say. A ket vector which is expressible linearly in terms of certain 
others is said to be dependent on them. A set of ket vectors are called 
independent if no one of them is expressible linearly in terms of the 
others. 

We now assume that each state of a dynamical system at a particular 
time corresponds to a ket vector, the correspondence being such that if a 
state results from the superposition of certain other states, its correspond- 
ing ket vector is expressible linearly in terms of the corresponding ket 
vectors of the other states, and conversely. Thus the state R results from 
a superposition of the states A and B when the corresponding ket 
vectors are connected by (1). 

The above assumption leads to certain properties of the super- 
position process, properties which are in fact necessary for the word 
‘superposition’ to be appropriate. When two or more states are 
superposed, the order in which they occur in the superposition 
process is unimportant, so the superposition process is symmetrical 
between the states that are superposed. Again, we see from equation 
(1) that (excluding the case when the coefficient c₁ or c₂ is zero) if 
the state R can be formed by superposition of the states A and B, 
then the state A can be formed by superposition of B and R, and B 
can be formed by superposition of A and R. The superposition 
relationship is symmetrical between all three states A, B, and R. 
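
The symmetry can be made explicit by solving equation (1) for each ket in turn. The rearrangement below is a worked step, assuming both coefficients are nonzero as the text requires.

```latex
% Rearranging equation (1), c_1|A> + c_2|B> = |R>, with c_1 \neq 0 and c_2 \neq 0:
|A\rangle = \frac{1}{c_1}\,|R\rangle - \frac{c_2}{c_1}\,|B\rangle ,
\qquad
|B\rangle = \frac{1}{c_2}\,|R\rangle - \frac{c_1}{c_2}\,|A\rangle .
```

Each right-hand side is a linear combination of the other two kets with nonzero coefficients, so each of the three states is a superposition of the other two.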

A state which results from the superposition of certain other 
states will be said to be dependent on those states. More generally, 
a state will be said to be dependent on any set of states, finite or 
infinite in number, if its corresponding ket vector is dependent on 
the corresponding ket vectors of the set of states. A set of states 
will be called independent if no one of them is dependent on the 
others. 

To proceed with the mathematical formulation of the superposition 
principle we must introduce a further assumption, namely the assump- 
tion that by superposing a state with itself we cannot form any new 
state, but only the original state over again. If the original state 
corresponds to the ket vector |A>, when it is superposed with itself 
the resulting state will correspond to 


$$c_1|A\rangle + c_2|A\rangle = (c_1 + c_2)|A\rangle,$$

where c₁ and c₂ are numbers. Now we may have c₁+c₂ = 0, in which 
case the result of the superposition process would be nothing at all, 
the two components having cancelled each other by an interference 
effect. Our new assumption requires that, apart from this special 
case, the resulting state must be the same as the original one, so that 
(c₁+c₂)|A> must correspond to the same state that |A> does. Now 
c₁+c₂ is an arbitrary complex number and hence we can conclude 
that if the ket vector corresponding to a state is multiplied by any 
complex number, not zero, the resulting ket vector will correspond to the 
same state. Thus a state is specified by the direction of a ket vector 
and any length one may assign to the ket vector is irrelevant. All 
the states of the dynamical system are in one-one correspondence 
with all the possible directions for a ket vector, no distinction being 
made between the directions of the ket vectors |A> and −|A>. 

The assumption just made shows up very clearly the fundamental 
difference between the superposition of the quantum theory and any 
kind of classical superposition. In the case of a classical system for 
which a superposition principle holds, for instance a vibrating 
membrane, when one superposes a state with itself the result is a different 
state, with a different magnitude of the oscillations. There is no 
physical characteristic of a quantum state corresponding to the 
magnitude of the classical oscillations, as distinct from their quality, 
described by the ratios of the amplitudes at different points of 
the membrane. Again, while there exists a classical state with zero 
amplitude of oscillation everywhere, namely the state of rest, there 
does not exist any corresponding state for a quantum system, the 
zero ket vector corresponding to no state at all. 
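
The statement that a state is fixed by the direction of its ket, and not by its length, can be sketched for kets with a finite number of components. This is an illustration under that assumption (the kets of the theory may need infinitely many components); the name `same_state` and the tolerance are hypothetical.

```python
def same_state(u, v, tol=1e-12):
    """Return True if the kets u and v (lists of complex components)
    differ only by a nonzero complex factor, i.e. have the same direction."""
    if len(u) != len(v):
        raise ValueError("kets must have the same number of components")
    if all(abs(x) < tol for x in u) or all(abs(x) < tol for x in v):
        raise ValueError("the zero ket corresponds to no state")
    n = len(u)
    # u and v are proportional exactly when every 2x2 minor vanishes.
    return all(abs(u[i] * v[j] - u[j] * v[i]) < tol
               for i in range(n) for j in range(i + 1, n))
```

Multiplying every component of a ket by the same nonzero complex number, including −1, leaves `same_state` true, whereas the zero ket is rejected outright.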

Given two states corresponding to the ket vectors |A> and |B>, 
the general state formed by superposing them corresponds to a ket 
vector |R> which is determined by two complex numbers, namely 
the coefficients c₁ and c₂ of equation (1). If these two coefficients are 
multiplied by the same factor (itself a complex number), the ket 
vector |R> will get multiplied by this factor and the corresponding 
state will be unaltered. Thus only the ratio of the two coefficients 
is effective in determining the state R. Hence this state is deter- 
mined by one complex number, or by two real parameters. Thus 
from two given states, a twofold infinity of states may be obtained 
by superposition. 

This result is confirmed by the examples discussed in §§ 2 and 3. 
In the example of § 2 there are just two independent states of polari- 
zation for a photon, which may be taken to be the states of plane 
polarization parallel and perpendicular to some fixed direction, and 
from the superposition of these two a twofold infinity of states of 
polarization can be obtained, namely all the states of elliptic polari- 
zation, the general one of which requires two parameters to describe 
it. Again, in the example of § 3, from the superposition of two given 
translational states for a photon a twofold infinity of translational 
states may be obtained, the general one of which is described by two 
parameters, which may be taken to be the ratio of the amplitudes 
of the two wave functions that are added together and their phase 
relationship. This confirmation shows the need for allowing complex 
coefficients in equation (1). If these coefficients were restricted to be 
real, then, since only their ratio is of importance for determining the 
direction of the resultant ket vector |R> when |A> and |B> are 
given, there would be only a simple infinity of states obtainable from 
the superposition. 
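
The parameter count can be written out as follows. This is a sketch; the symbols r and δ are introduced here for illustration and do not appear in the text.

```latex
% Dividing equation (1) by c_1 (assumed nonzero) leaves the state unchanged:
|R\rangle \;\propto\; |A\rangle + \frac{c_2}{c_1}\,|B\rangle
          \;=\; |A\rangle + r\,e^{i\delta}\,|B\rangle ,
\qquad r \ge 0, \quad 0 \le \delta < 2\pi .
% Complex coefficients: two real parameters (r, \delta), a twofold infinity.
% Real coefficients: c_2/c_1 is a single real number, a simple infinity.
```

The excluded case in which the first coefficient vanishes gives the state B itself.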


6. Bra and ket vectors 

Whenever we have a set of vectors in any mathematical theory, 
we can always set up a second set of vectors, which mathematicians 
call the dual vectors. The procedure will be described for the case 
when the original vectors are our ket vectors. 

Suppose we have a number φ which is a function of a ket vector 
|A>, i.e. to each ket vector |A> there corresponds one number φ, 
and suppose further that the function is a linear one, which means 
that the number corresponding to |A>+|A’> is the sum of the 
numbers corresponding to |A> and to |A’>, and the number 
corresponding to c|A> is c times the number corresponding to |A>, c 
being any numerical factor. Then the number φ corresponding to 
any |A> may be looked upon as the scalar product of that |A> with 
some new vector, there being one of these new vectors for each linear 
function of the ket vectors |A>. The justification for this way of 
looking at φ is that, as will be seen later (see equations (5) and (6)), 
the new vectors may be added together and may be multiplied by 
numbers to give other vectors of the same kind. The new vectors 
are, of course, defined only to the extent that their scalar products 
with the original ket vectors are given numbers, but this is suffi- 
cient for one to be able to build up a mathematical theory about 
them. 

We shall call the new vectors bra vectors, or simply bras, and denote 
a general one of them by the symbol <|, the mirror image of the 
symbol for a ket vector. If we want to specify a particular one of 
them by a label, B say, we write it in the middle, thus <B|. The 
scalar product of a bra vector <B| and a ket vector |A> will be 
written <B|A>, i.e. as a juxtaposition of the symbols for the bra 
and ket vectors, that for the bra vector being on the left, and the 
two vertical lines being contracted to one for brevity. 

One may look upon the symbols < and > as a distinctive kind of 
brackets. A scalar product <B|A> now appears as a complete bracket 
expression and a bra vector <B| or a ket vector |A> as an incomplete 
bracket expression. We have the rules that any complete bracket 
expression denotes a number and any incomplete bracket expression 
denotes a vector, of the bra or ket kind according to whether it contains 
the first or second part of the brackets. 

The condition that the scalar product of <B| and |A> is a linear 
function of |A> may be expressed symbolically by 

$$\langle B|\{|A\rangle + |A'\rangle\} = \langle B|A\rangle + \langle B|A'\rangle, \tag{2}$$

$$\langle B|\{c|A\rangle\} = c\langle B|A\rangle, \tag{3}$$

c being any number. 

A bra vector is considered to be completely defined when its scalar 
product with every ket vector is given, so that if a bra vector has its 
scalar product with every ket vector vanishing, the bra vector itself 
must be considered as vanishing. In symbols, if 

$$\langle P|A\rangle = 0, \quad \text{all } |A\rangle,$$

then 

$$\langle P| = 0. \tag{4}$$

The sum of two bra vectors <B| and <B’| is defined by the condition 
that its scalar product with any ket vector |A> is the sum of the 
scalar products of <B| and <B’| with |A>, 

$$\{\langle B| + \langle B'|\}|A\rangle = \langle B|A\rangle + \langle B'|A\rangle, \tag{5}$$

and the product of a bra vector <B| and a number c is defined by the 
condition that its scalar product with any ket vector |A> is c times 
the scalar product of <B| with |A>, 

$$\{c\langle B|\}|A\rangle = c\langle B|A\rangle. \tag{6}$$


Equations (2) and (5) show that products of bra and ket vectors 
satisfy the distributive axiom of multiplication, and equations (3) 
and (6) show that multiplication by numerical factors satisfies the 
usual algebraic axioms. 
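
The definition of a bra purely through its scalar products can be sketched in code. The illustration below assumes kets with a finite number of components and represents a bra as a function from kets to numbers; `make_bra`, `add_bras` and `scale_bra` are hypothetical names chosen for this sketch.

```python
def make_bra(components):
    """Return a bra as a linear function mapping a ket (list of complex
    numbers) to a number. `components` are the bra's own components."""
    def bra(ket):
        return sum(b * a for b, a in zip(components, ket))
    return bra

def add_bras(bra1, bra2):
    # Equation (5): the sum is defined through its scalar products.
    return lambda ket: bra1(ket) + bra2(ket)

def scale_bra(c, bra):
    # Equation (6): multiplication by a number, again via scalar products.
    return lambda ket: c * bra(ket)
```

Nothing in this sketch relates a bra to any particular ket; at this stage the only link between the two kinds of vector is the scalar product itself.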

The bra vectors, as they have been here introduced, are quite a 
different kind of vector from the kets, and so far there is no connexion 
between them except for the existence of a scalar product of a bra 
and a ket. We now make the assumption that there is a one-one 
correspondence between the bras and the kets, such that the bra 
corresponding to |A>+|A’> is the sum of the bras corresponding to |A> and 
to |A’>, and the bra corresponding to c|A> is c̄ times the bra 
corresponding to |A>, c̄ being the conjugate complex number to c. We shall 
use the same label to specify a ket and the corresponding bra. Thus 
the bra corresponding to |A> will be written <A|. 

The relationship between a ket vector and the corresponding bra 
makes it reasonable to call one of them the conjugate imaginary of 
the other. Our bra and ket vectors are complex quantities, since they 
can be multiplied by complex numbers and are then of the same 
nature as before, but they are complex quantities of a special kind 
which cannot be split up into real and pure imaginary parts. The 
usual method of getting the real part of a complex quantity, by 
taking half the sum of the quantity itself and its conjugate, cannot 
be applied since a bra and a ket vector are of different natures and 
cannot be added together. To call attention to this distinction, we 
shall use the words ‘conjugate complex’ to refer to numbers and 
other complex quantities which can be split up into real and pure 
imaginary parts, and the words ‘conjugate imaginary’ for bra and 
ket vectors, which cannot. With the former kind of quantity, we 
shall use the notation of putting a bar over one of them to get the 
conjugate complex one. 

On account of the one-one correspondence between bra vectors and 
ket vectors, any state of our dynamical system at a particular time may 
be specified by the direction of a bra vector just as well as by the direction 
of a ket vector. In fact the whole theory will be symmetrical in its 
essentials between bras and kets. 

Given any two ket vectors |A> and |B>, we can construct from 
them a number <B|A> by taking the scalar product of the first with 
the conjugate imaginary of the second. This number depends linearly 
on |A> and antilinearly on |B>, the antilinear dependence meaning 
that the number formed from |B>+|B’> is the sum of the numbers 
formed from |B> and from |B’>, and the number formed from c|B> 
is c̄ times the number formed from |B>. There is a second way in 
which we can construct a number which depends linearly on |A> and 
antilinearly on |B>, namely by forming the scalar product of |B> 
with the conjugate imaginary of |A> and taking the conjugate 
complex of this scalar product. We assume that these two numbers are 
always equal, i.e. 

$$\langle B|A\rangle = \overline{\langle A|B\rangle}. \tag{7}$$

Putting |B> = |A> here, we find that the number <A|A> must be 
real. We make the further assumption 

$$\langle A|A\rangle > 0, \tag{8}$$

except when |A> = 0. 
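
Equations (7) and (8) can be checked on a concrete model. The sketch below assumes kets with finitely many complex components and takes the bra corresponding to a ket to have the complex-conjugate components; this component model and the names `bra_of` and `bracket` are assumptions of the illustration.

```python
def bra_of(ket):
    """Components of the bra that is the conjugate imaginary of the ket."""
    return [x.conjugate() for x in ket]

def bracket(b, a):
    """Scalar product <B|A> of two kets b and a: the ket a multiplied
    by the conjugate imaginary of the ket b."""
    return sum(x * y for x, y in zip(bra_of(b), a))
```

With this model `bracket(b, a)` is linear in `a` and antilinear in `b`, interchanging the arguments gives the conjugate complex number, and `bracket(a, a)` is a sum of squared moduli, hence real and positive unless every component vanishes.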

In ordinary space, from any two vectors one can construct a 
number—their scalar product—which is a real number and is sym- 
metrical between them. In the space of bra vectors or the space of 
ket vectors, from any two vectors one can again construct a number 
—the scalar product of one with the conjugate imaginary of the 
other—but this number is complex and goes over into the conjugate 
complex number when the two vectors are interchanged. There is 
thus a kind of perpendicularity in these spaces, which is a generaliza- 
tion of the perpendicularity in ordinary space. We shall call a bra 
and a ket vector orthogonal if their scalar product is zero, and two 
bras or two kets will be called orthogonal if the scalar product of one 
with the conjugate imaginary of the other is zero. Further, we shall 
say that two states of our dynamical system are orthogonal if the 
vectors corresponding to these states are orthogonal. 

The length of a bra vector <A| or of the conjugate imaginary ket 
vector |A> is defined as the square root of the positive number 
<A|A>. When we are given a state and wish to set up a bra or ket 
vector to correspond to it, only the direction of the vector is given 
and the vector itself is undetermined to the extent of an arbitrary 
numerical factor. It is often convenient to choose this numerical 
factor so that the vector is of length unity. This procedure is called 
normalization and the vector so chosen is said to be normalized. The 
vector is not completely determined even then, since one can still 
multiply it by any number of modulus unity, i.e. any number e^{iγ} 
where γ is real, without changing its length. We shall call such a 
number a phase factor. 
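
Orthogonality, length and normalization can be illustrated in the same finite-component model (an assumption of this sketch, as are all the function names).

```python
import cmath
import math

def bracket(b, a):
    """Scalar product <B|A> of two kets given as lists of complex components."""
    return sum(x.conjugate() * y for x, y in zip(b, a))

def length(ket):
    # Square root of the positive number <A|A>.
    return math.sqrt(bracket(ket, ket).real)

def normalize(ket):
    """Rescale a ket to length unity; its direction is unchanged."""
    n = length(ket)
    if n == 0:
        raise ValueError("the zero ket has no direction and cannot be normalized")
    return [x / n for x in ket]

def are_orthogonal(a, b, tol=1e-12):
    """True if the scalar product of a with the conjugate imaginary of b is zero."""
    return abs(bracket(b, a)) < tol
```

Multiplying a normalized ket by a phase factor such as `cmath.exp(0.7j)` leaves its length equal to unity, which is why normalization alone does not fix the vector completely.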

The foregoing assumptions give the complete scheme of relations 
between the states of a dynamical system at a particular time. The 
relations appear in mathematical form, but they imply physical 
conditions, which will lead to results expressible in terms of observa- 
tions when the theory is developed further. For instance, if two states 
are orthogonal, it means at present simply a certain equation in our 
formalism, but this equation implies a definite physical relationship 
between the states, which further developments of the theory will 
enable us to interpret in terms of observational results (see the 
bottom of p. 35). 


7. Linear operators 

In the preceding section we considered a number which is a linear 
function of a ket vector, and this led to the concept of a bra vector. 
We shall now consider a ket vector which is a linear function of a 
ket vector, and this will lead to the concept of a linear operator. 

Suppose we have a ket |F> which is a function of a ket |A>, i.e. 
to each ket |A> there corresponds one ket |F>, and suppose further 
that the function is a linear one, which means that the |F> 
corresponding to |A>+|A’> is the sum of the |F>’s corresponding to |A> 
and to |A’>, and the |F> corresponding to c|A> is c times the |F> 
corresponding to |A>, c being any numerical factor. Under these 
conditions, we may look upon the passage from |A> to |F> as the 
application of a linear operator to |A>. Introducing the symbol α 
for the linear operator, we may write 

$$|F\rangle = \alpha|A\rangle,$$

in which the result of α operating on |A> is written like a product 
of α with |A>. We make the rule that in such products the ket vector 
must always be put on the right of the linear operator. The above 
conditions of linearity may now be expressed by the equations 

$$\alpha\{|A\rangle + |A'\rangle\} = \alpha|A\rangle + \alpha|A'\rangle, \qquad \alpha\{c|A\rangle\} = c\,\alpha|A\rangle. \tag{1}$$

A linear operator is considered to be completely defined when the 
result of its application to every ket vector is given. Thus a linear 
operator is to be considered zero if the result of its application to every 
ket vanishes, and two linear operators are to be considered equal if 
they produce the same result when applied to every ket. 

Linear operators can be added together, the sum of two linear 
operators being defined to be that linear operator which, operating 
on any ket, produces the sum of what the two linear operators 
separately would produce. Thus α+β is defined by

$$\{\alpha+\beta\}|A\rangle = \alpha|A\rangle + \beta|A\rangle \tag{2}$$

for any |A>. Equation (2) and the first of equations (1) show that
products of linear operators with ket vectors satisfy the distributive 
axiom of multiplication.

Linear operators can also be multiplied together, the product of
two linear operators being defined as that linear operator, the
application of which to any ket produces the same result as the application
of the two linear operators successively. Thus the product αβ is
defined as the linear operator which, operating on any ket |A>,
changes it into that ket which one would get by operating first on
|A> with β, and then on the result of the first operation with α. In
symbols
$$\{\alpha\beta\}|A\rangle = \alpha\{\beta|A\rangle\}.$$

This definition appears as the associative axiom of multiplication for
the triple product of α, β, and |A>, and allows us to write this triple
product as αβ|A> without brackets. However, this triple product is
in general not the same as what we should get if we operated on |A>
first with α and then with β, i.e. in general αβ|A> differs from βα|A>,
so that in general αβ must differ from βα. The commutative axiom of
multiplication does not hold for linear operators. It may happen as a
special case that two linear operators ξ and η are such that ξη and
ηξ are equal. In this case we say that ξ commutes with η, or that ξ
and η commute.

By repeated applications of the above processes of adding and 
multiplying linear operators, one can form sums and products of
more than two of them, and one can proceed to build up an algebra 
with them. In this algebra the commutative axiom of multiplication 
does not hold, and also the product of two linear operators may 
vanish without either factor vanishing. But all the other axioms of 
ordinary algebra, including the associative and distributive axioms 
of multiplication, are valid, as may easily be verified. 
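
The two departures from ordinary algebra noted above can be seen in a small numerical sketch. It assumes, purely for illustration, that kets are represented by two-component columns and linear operators by 2x2 arrays; the helper `matmul` and the particular matrices are hypothetical choices, not part of the text.

```python
# Hypothetical helper: product of two square matrices stored as nested lists.
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Two illustrative linear operators on a two-dimensional ket space (an assumption).
alpha = [[0, 1],
         [0, 0]]
beta = [[0, 0],
        [1, 0]]

alpha_beta = matmul(alpha, beta)    # operate first with beta, then with alpha
beta_alpha = matmul(beta, alpha)    # operate first with alpha, then with beta
alpha_alpha = matmul(alpha, alpha)  # product of a non-zero operator with itself

print(alpha_beta)    # [[1, 0], [0, 0]]
print(beta_alpha)    # [[0, 0], [0, 1]]
print(alpha_alpha)   # [[0, 0], [0, 0]]
```

Here αβ and βα differ, and the product αα vanishes although α itself does not.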

If we take a number k and multiply it into ket vectors, it appears
as a linear operator operating on ket vectors, the conditions (1) being
fulfilled with k substituted for α. A number is thus a special case of
a linear operator. It has the property that it commutes with all linear 
operators and this property distinguishes it from a general linear 
operator. 

So far we have considered linear operators operating only on ket 
vectors. We can give a meaning to their operating also on bra vectors, 
in the following way. Take the scalar product of any bra <B| with
the ket α|A>. This scalar product is a number which depends
linearly on |A> and therefore, from the definition of bras, it may be
considered as the scalar product of |A> with some bra. The bra thus
defined depends linearly on <B|, so we may look upon it as the result of
some linear operator applied to <B|. This linear operator is uniquely
determined by the original linear operator α and may reasonably be
called the same linear operator operating on a bra. In this way our
linear operators are made capable of operating on bra vectors. 

A suitable notation to use for the resulting bra when α operates on
the bra <B| is <B|α, as in this notation the equation which defines
<B|α is
$$\{\langle B|\alpha\}|A\rangle = \langle B|\{\alpha|A\rangle\} \tag{3}$$

for any |A>, which simply expresses the associative axiom of
multiplication for the triple product of <B|, α, and |A>. We therefore
make the general rule that in a product of a bra and a linear operator,
the bra must always be put on the left. We can now write the triple
product of <B|, α, and |A> simply as <B|α|A> without brackets. It
may easily be verified that the distributive axiom of multiplication 
holds for products of bras and linear operators just as well as for 
products of linear operators and kets. 

There is one further kind of product which has a meaning in our 
scheme, namely the product of a ket vector and a bra vector with 
the ket on the left, such as |A><B|. To examine this product, let us 
multiply it into an arbitrary ket |P>, putting the ket on the right, 
and assume the associative axiom of multiplication. The product is 
then |A><B|P>, which is another ket, namely |A> multiplied by the
number <B|P>, and this ket depends linearly on the ket |P>. Thus
|A><B| appears as a linear operator that can operate on kets. It
can also operate on bras, its product with a bra <Q| on the left being
<Q|A><B|, which is the number <Q|A> times the bra <B|. The
product |A><B| is to be sharply distinguished from the product
<B|A> of the same factors in the reverse order, the latter product
being, of course, a number. 
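
The distinction between |A><B| and <B|A> can be illustrated with a short sketch. It assumes a two-dimensional representation in which a ket is a list of complex components and the conjugate imaginary bra is the list of conjugated components; the helper names and the sample vectors are hypothetical.

```python
def conj_bra(ket):
    """Components of the bra that is the conjugate imaginary of a ket."""
    return [c.conjugate() for c in ket]

def inner(bra, ket):
    """Scalar product of a bra with a ket: a number."""
    return sum(b * k for b, k in zip(bra, ket))

def outer(ket, bra):
    """The product |A><B| as an array, i.e. a linear operator."""
    return [[k * b for b in bra] for k in ket]

def apply(op, ket):
    """Result of a linear operator acting on a ket."""
    return [sum(row[j] * ket[j] for j in range(len(ket))) for row in op]

A = [1 + 0j, 2j]                  # illustrative ket |A>
B_bra = conj_bra([1j, 1 + 0j])    # illustrative bra <B|
P = [3 + 0j, 1 - 1j]              # arbitrary ket |P>

op = outer(A, B_bra)              # |A><B|, an operator
lhs = apply(op, P)                # (|A><B|)|P>
number = inner(B_bra, P)          # <B|P>, a number
rhs = [a * number for a in A]     # |A> multiplied by the number <B|P>

print(lhs == rhs)                 # True
print(inner(B_bra, A))            # <B|A> is a single number, not an operator
```

The associative regrouping (|A><B|)|P> = |A>(<B|P>) is what the comparison of `lhs` and `rhs` checks.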

We now have a complete algebraic scheme involving three kinds 
of quantities, bra vectors, ket vectors, and linear operators. They can 
be multiplied together in the various ways discussed above, and the 
associative and distributive axioms of multiplication always hold, 
but the commutative axiom of multiplication does not hold. In this 
general scheme we still have the rules of notation of the preceding 
section, that any complete bracket expression, containing < on the
left and > on the right, denotes a number, while any incomplete
bracket expression, containing only < or >, denotes a vector.

With regard to the physical significance of the scheme, we have
already assumed that the bra vectors and ket vectors, or rather the 
directions of these vectors, correspond to the states of a dynamical 
system at a particular time. We now make the further assumption 
that the linear operators correspond to the dynamical variables at that 
time. By dynamical variables are meant quantities such as the 
coordinates and the components of velocity, momentum and angular 
momentum of particles, and functions of these quantities—in fact 
the variables in terms of which classical mechanics is built up. The 
new assumption requires that these quantities shall occur also in 
quantum mechanics, but with the striking difference that they are 
now subject to an algebra in which the commutative axiom of multiplica- 
tion does not hold. 

This different algebra for the dynamical variables is one of the 
most important ways in which quantum mechanics differs from 
classical mechanics. We shall see later on that, in spite of this funda- 
mental difference, the dynamical variables of quantum mechanics 
still have many properties in common with their classical counter- 
parts and it will be possible to build up a theory of them closely 
analogous to the classical theory and forming a beautiful generalization
of it.

It is convenient to use the same letter to denote a dynamical 
variable and the corresponding linear operator. In fact, we may con- 
sider a dynamical variable and the corresponding linear operator to 
be both the same thing, without getting into confusion. 


8. Conjugate relations 

Our linear operators are complex quantities, since one can multiply 
them by complex numbers and get other quantities of the same nature. 
Hence they must correspond in general to complex dynamical vari- 
ables, i.e. to complex functions of the coordinates, velocities, etc. We 
need some further development of the theory to see what kind of 
linear operator corresponds to a real dynamical variable. 

Consider the ket which is the conjugate imaginary of <P|α. This
ket depends antilinearly on <P| and thus depends linearly on |P>.
It may therefore be considered as the result of some linear operator
operating on |P>. This linear operator is called the adjoint of α and
we shall denote it by ᾱ. With this notation, the conjugate imaginary
of <P|α is ᾱ|P>.

In formula (7) of Chapter I put <P|α for <A| and its conjugate
imaginary ᾱ|P> for |A>. The result is
$$\langle B|\bar{\alpha}|P\rangle = \overline{\langle P|\alpha|B\rangle}. \tag{4}$$
This is a general formula holding for any ket vectors |B>, |P> and
any linear operator α, and it expresses one of the most frequently
used properties of the adjoint.

Putting ᾱ for α in (4), we get
$$\langle B|\bar{\bar{\alpha}}|P\rangle = \overline{\langle P|\bar{\alpha}|B\rangle} = \langle B|\alpha|P\rangle,$$
by using (4) again with |P> and |B> interchanged. This holds for
any ket |P>, so we can infer from (4) of Chapter I,
$$\langle B|\bar{\bar{\alpha}} = \langle B|\alpha,$$
and since this holds for any bra vector <B|, we can infer
$$\bar{\bar{\alpha}} = \alpha.$$

Thus the adjoint of the adjoint of a linear operator is the original linear 
operator. This property of the adjoint makes it like the conjugate 
complex of a number, and it is easily verified that in the special case 
when the linear operator is a number, the adjoint linear operator is 
the conjugate complex number. Thus it is reasonable to assume that 
the adjoint of a linear operator corresponds to the conjugate complex of 
a dynamical variable. With this physical significance for the adjoint 
of a linear operator, we may call the adjoint alternatively the conjugate
complex linear operator, which conforms with our notation ᾱ.

A linear operator may equal its adjoint, and is then called self- 
adjoint. It corresponds to a real dynamical variable, so it may be 
called alternatively a real linear operator. Any linear operator may 
be split up into a real part and a pure imaginary part. For this 
reason the words ‘conjugate complex’ are applicable to linear 
operators and not the words ‘conjugate imaginary’. 
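
As a hedged illustration of these properties, the sketch below assumes a finite representation in which a linear operator is a square array and its adjoint is the conjugate transpose of that array (an assumption of the representation, not something established in the text so far). It checks formula (4), the double-adjoint rule, and the splitting of an operator into a real part and i times another real operator. All names and sample values are hypothetical.

```python
def adjoint(op):
    """Conjugate transpose, taken here to represent the adjoint."""
    n = len(op)
    return [[op[j][i].conjugate() for j in range(n)] for i in range(n)]

def sandwich(ket_x, op, ket_y):
    """<X|op|Y>, where ket_x holds the components of the ket |X>."""
    n = len(ket_y)
    return sum(ket_x[i].conjugate() * op[i][j] * ket_y[j]
               for i in range(n) for j in range(n))

alpha = [[1 + 2j, 3j], [2 + 0j, 1 - 1j]]   # illustrative operator
B = [1 + 1j, 2 + 0j]                        # illustrative kets
P = [0 + 1j, 1 - 2j]

lhs = sandwich(B, adjoint(alpha), P)        # <B|adjoint(alpha)|P>
rhs = sandwich(P, alpha, B).conjugate()     # conjugate complex of <P|alpha|B>

# Real part and pure imaginary part: alpha = xi + i*eta with xi, eta self-adjoint.
n = len(alpha)
adj = adjoint(alpha)
xi = [[0.5 * (alpha[i][j] + adj[i][j]) for j in range(n)] for i in range(n)]
eta = [[-0.5j * (alpha[i][j] - adj[i][j]) for j in range(n)] for i in range(n)]
recombined = [[xi[i][j] + 1j * eta[i][j] for j in range(n)] for i in range(n)]

print(lhs == rhs)                           # formula (4)
print(adjoint(adjoint(alpha)) == alpha)     # adjoint of the adjoint
print(adjoint(xi) == xi, adjoint(eta) == eta, recombined == alpha)
```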

The conjugate complex of the sum of two linear operators is 
obviously the sum of their conjugate complexes. To get the conjugate 
complex of the product of two linear operators α and β, we apply
formula (7) of Chapter I with
$$\langle A| = \langle P|\alpha, \qquad \langle B| = \langle Q|\bar{\beta},$$
so that
$$|A\rangle = \bar{\alpha}|P\rangle, \qquad |B\rangle = \beta|Q\rangle.$$
The result is
$$\langle Q|\bar{\beta}\bar{\alpha}|P\rangle = \overline{\langle P|\alpha\beta|Q\rangle} = \langle Q|\overline{\alpha\beta}|P\rangle$$
from (4). Since this holds for any |P> and <Q|, we can infer that
$$\bar{\beta}\bar{\alpha} = \overline{\alpha\beta}. \tag{5}$$
Thus the conjugate complex of the product of two linear operators equals
the product of the conjugate complexes of the factors in the reverse order.
As simple examples of this result, it should be noted that, if ξ and
η are real, in general ξη is not real. This is an important difference
from classical mechanics. However, ξη+ηξ is real, and so is i(ξη−ηξ).
Only when ξ and η commute is ξη itself also real. Further, if ξ is real,
then so is ξ² and, more generally, ξⁿ with n any positive integer.
We may get the conjugate complex of the product of three linear 
operators by successive applications of the rule (5) for the conjugate 
complex of the product of two of them. We have 


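$$\overline{\alpha\beta\gamma} = \overline{\alpha(\beta\gamma)} = \overline{\beta\gamma}\,\bar{\alpha} = \bar{\gamma}\bar{\beta}\bar{\alpha}, \tag{6}$$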
so the conjugate complex of the product of three linear operators 
equals the product of the conjugate complexes of the factors in the 
reverse order. The rule may easily be extended to the product of any 
number of linear operators. 
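
The reversal rule and its consequences for real operators can be checked numerically. The sketch assumes a 2x2 matrix representation with the adjoint taken as the conjugate transpose; the two self-adjoint matrices `xi` and `eta` are arbitrary illustrative choices.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def adjoint(op):
    """Conjugate transpose, assumed to represent the adjoint."""
    n = len(op)
    return [[op[j][i].conjugate() for j in range(n)] for i in range(n)]

def is_real(op):
    """A linear operator is real when it equals its adjoint."""
    return adjoint(op) == op

xi = [[1, 1j], [-1j, 2]]     # real (self-adjoint) operator
eta = [[0, 1], [1, 3]]       # another real operator, not commuting with xi

xi_eta = matmul(xi, eta)
eta_xi = matmul(eta, xi)
n = len(xi)
symmetric = [[xi_eta[i][j] + eta_xi[i][j] for j in range(n)] for i in range(n)]
commutator_i = [[1j * (xi_eta[i][j] - eta_xi[i][j]) for j in range(n)]
                for i in range(n)]

# Rule (5): adjoint of a product is the product of adjoints in reverse order.
print(adjoint(xi_eta) == matmul(adjoint(eta), adjoint(xi)))
print(is_real(xi_eta))        # False: the product of two real operators
print(is_real(symmetric))     # True: xi*eta + eta*xi
print(is_real(commutator_i))  # True: i(xi*eta - eta*xi)
```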

In the preceding section we saw that the product |A><B| is a linear
operator. We may get its conjugate complex by referring directly to
the definition of the adjoint. Multiplying |A><B| into a general bra
<P| we get <P|A><B|, whose conjugate imaginary ket is
$$\overline{\langle P|A\rangle}\,|B\rangle = \langle A|P\rangle|B\rangle = |B\rangle\langle A|P\rangle.$$
Hence
$$\overline{|A\rangle\langle B|} = |B\rangle\langle A|. \tag{7}$$

We now have several rules concerning conjugate complexes and 
conjugate imaginaries of products, namely equation (7) of Chapter I, 
equations (4), (5), (6), (7) of this chapter, and the rule that the 
conjugate imaginary of <P|α is ᾱ|P>. These rules can all be summed
up in a single comprehensive rule, the conjugate complex or conjugate 
imaginary of any product of bra vectors, ket vectors, and linear operators 
is obtained by taking the conjugate complex or conjugate imaginary of 
each factor and reversing the order of all the factors. The rule is easily 
verified to hold quite generally, also for the cases not explicitly given 
above. 

THEOREM. If ξ is a real linear operator and
$$\xi^m|P\rangle = 0 \tag{8}$$
for a particular ket |P>, m being a positive integer, then
$$\xi|P\rangle = 0.$$

To prove the theorem, take first the case when m = 2. Equation
(8) then gives
$$\langle P|\xi^2|P\rangle = 0,$$
showing that the ket ξ|P> multiplied by the conjugate imaginary bra
<P|ξ is zero. From the assumption (8) of Chapter I with ξ|P> for |A>,
we see that ξ|P> must be zero. Thus the theorem is proved for m = 2.

Now take m > 2 and put
$$\xi^{m-2}|P\rangle = |Q\rangle.$$
Equation (8) now gives
$$\xi^2|Q\rangle = 0.$$
Applying the theorem for m = 2, we get
$$\xi|Q\rangle = 0$$
or
$$\xi^{m-1}|P\rangle = 0. \tag{9}$$
By repeating the process by which equation (9) is obtained from
(8), we obtain successively
$$\xi^{m-2}|P\rangle = 0, \quad \xi^{m-3}|P\rangle = 0, \quad \dots, \quad \xi^2|P\rangle = 0, \quad \xi|P\rangle = 0,$$
and so the theorem is proved generally.
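
The role played by the reality of the operator in the m = 2 step can be illustrated with a small sketch, assuming a two-dimensional matrix representation (the matrices and kets are hypothetical examples). For a real operator the number <P|ξ²|P> equals the squared length of ξ|P>, so it can vanish only if ξ|P> does; for an operator that is not real, the square can annihilate a ket that the operator itself does not.

```python
def apply(op, ket):
    """Return op|ket> for an operator given as a nested list of rows."""
    return [sum(row[j] * ket[j] for j in range(len(ket))) for row in op]

def inner(ket_x, ket_y):
    """<X|Y>, with the bra taken as the conjugate imaginary of ket_x."""
    return sum(x.conjugate() * y for x, y in zip(ket_x, ket_y))

# Real (self-adjoint) operator: <P|xi^2|P> is the squared length of xi|P>.
xi = [[1, 1j], [-1j, 1]]
P = [1, 2]
xi_P = apply(xi, P)
lhs = inner(P, apply(xi, xi_P))   # <P|xi^2|P>
rhs = inner(xi_P, xi_P)           # squared length of xi|P>

# Operator that is not real: alpha^2|Q> vanishes although alpha|Q> does not.
alpha = [[0, 1], [0, 0]]
Q = [0, 1]
alpha_Q = apply(alpha, Q)
alpha2_Q = apply(alpha, alpha_Q)

print(lhs == rhs)   # True
print(alpha_Q)      # [1, 0]
print(alpha2_Q)     # [0, 0]
```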


9. Eigenvalues and eigenvectors

We must make a further development of the theory of linear
operators, consisting in studying the equation
$$\alpha|P\rangle = a|P\rangle, \tag{10}$$
where α is a linear operator and a is a number. This equation usually
presents itself in the form that α is a known linear operator and the
number a and the ket |P> are unknowns, which we have to try to
choose so as to satisfy (10), ignoring the trivial solution |P> = 0.

Equation (10) means that the linear operator α applied to the ket
|P> just multiplies this ket by a numerical factor without changing
its direction, or else multiplies it by the factor zero, so that it ceases
to have a direction. This same α applied to other kets will, of course,
in general change both their lengths and their directions. It should
be noticed that only the direction of |P> is of importance in equation
(10). If one multiplies |P> by any number not zero, it will not affect
the question of whether (10) is satisfied or not.

Together with equation (10), we should consider also the conjugate
imaginary form of equation
$$\langle Q|\alpha = b\langle Q|, \tag{11}$$
where b is a number. Here the unknowns are the number b and the
non-zero bra <Q|. Equations (10) and (11) are of such fundamental
importance in the theory that it is desirable to have some special 
words to describe the relationships between the quantities involved. 
If (10) is satisfied, we shall call a an eigenvalue† of the linear operator
α, or of the corresponding dynamical variable, and we shall call |P>
an eigenket of the linear operator or dynamical variable. Further, we
shall say that the eigenket |P> belongs to the eigenvalue a. Similarly,
if (11) is satisfied, we shall call b an eigenvalue of α and <Q| an
eigenbra belonging to this eigenvalue. The words eigenvalue, eigen- 
ket, eigenbra have a meaning, of course, only with reference to a linear 
operator or dynamical variable. 

Using this terminology, we can assert that, if an eigenket of α is
multiplied by any number not zero, the resulting ket is also an
eigenket and belongs to the same eigenvalue as the original one.
It is possible to have two or more independent eigenkets of a linear
operator belonging to the same eigenvalue of that linear operator,
e.g. equation (10) may have several solutions, |P1>, |P2>, |P3>, ... say,
all holding for the same value of a, with the various eigenkets |P1>,
|P2>, |P3>, ... independent. In this case it is evident that any linear
combination of the eigenkets is another eigenket belonging to the
same eigenvalue of the linear operator, e.g.
$$c_1|P1\rangle + c_2|P2\rangle + c_3|P3\rangle + \dots$$
is another solution of (10), where c₁, c₂, c₃, ... are any numbers.

In the special case when the linear operator α of equations (10) and
(11) is a number, k say, it is obvious that any ket |P> and bra <Q|
will satisfy these equations provided a and b equal k. Thus a number 
considered as a linear operator has just one eigenvalue, and any ket 
is an eigenket and any bra is an eigenbra, belonging to this eigenvalue. 

The theory of eigenvalues and eigenvectors of a linear operator α
which is not real is not of much use for quantum mechanics. We
shall therefore confine ourselves to real linear operators for the further
development of the theory. Putting for α the real linear operator ξ,
we have instead of equations (10) and (11)
$$\xi|P\rangle = a|P\rangle, \tag{12}$$
$$\langle Q|\xi = b\langle Q|. \tag{13}$$

† The word ‘proper’ is sometimes used instead of ‘eigen’, but this is not satisfactory
as the words ‘proper’ and ‘improper’ are often used with other meanings. For example,
in §§ 15 and 46 the words ‘improper function’ and ‘proper-energy’ are used.

Three important results can now be readily deduced.
(i) The eigenvalues are all real numbers. To prove that a satisfying
(12) is real, we multiply (12) by the bra <P| on the left, obtaining
$$\langle P|\xi|P\rangle = a\langle P|P\rangle.$$
Now from equation (4) with <B| replaced by <P| and α replaced by
the real linear operator ξ, we see that the number <P|ξ|P> must be
real, and from (8) of § 6, <P|P> must be real and not zero. Hence a
is real. Similarly, by multiplying (13) by |Q> on the right, we can
prove that b is real.

Suppose we have a solution of (12) and we form the conjugate 
imaginary equation, which will read 


$$\langle P|\xi = a\langle P|$$

in view of the reality of ξ and a. This conjugate imaginary equation
now provides a solution of (13), with <Q| = <P| and b = a. Thus
we can infer 

(ii) The eigenvalues associated with eigenkets are the same as the 
eigenvalues associated with eigenbras. 

(iii) The conjugate imaginary of any eigenket is an eigenbra belonging 
to the same eigenvalue, and conversely. This last result makes it reason- 
able to call the state corresponding to any eigenket or to the conjugate 
imaginary eigenbra an eigenstate of the real dynamical variable ξ.

Eigenvalues and eigenvectors of various real dynamical variables 
are used very extensively in quantum mechanics, so it is desirable 
to have some systematic notation for labelling them. The following 
is suitable for most purposes. If ξ is a real dynamical variable, we
call its eigenvalues ξ', ξ'', ξʳ, etc. Thus we have a letter by itself
denoting a real dynamical variable or a real linear operator, and the
same letter with primes or an index attached denoting a number,
namely an eigenvalue of what the letter by itself denotes. An eigenvector
may now be labelled by the eigenvalue to which it belongs.
Thus |ξ'> denotes an eigenket belonging to the eigenvalue ξ' of the
dynamical variable ξ. If in a piece of work we deal with more than
one eigenket belonging to the same eigenvalue of a dynamical variable,
we may distinguish them one from another by means of a further
label, or possibly of more than one further label. Thus, if we are
dealing with two eigenkets belonging to the same eigenvalue of ξ,
we may call them |ξ'1> and |ξ'2>.

THEOREM. Two eigenvectors of a real dynamical variable belonging
to different eigenvalues are orthogonal.

To prove the theorem, let |ξ'> and |ξ''> be two eigenkets of the real
dynamical variable ξ, belonging to the eigenvalues ξ' and ξ'' respectively.
Then we have the equations
$$\xi|\xi'\rangle = \xi'|\xi'\rangle, \tag{14}$$
$$\xi|\xi''\rangle = \xi''|\xi''\rangle. \tag{15}$$
Taking the conjugate imaginary of (14), we get
$$\langle\xi'|\xi = \xi'\langle\xi'|.$$
Multiplying this by |ξ''> on the right gives
$$\langle\xi'|\xi|\xi''\rangle = \xi'\langle\xi'|\xi''\rangle$$
and multiplying (15) by <ξ'| on the left gives
$$\langle\xi'|\xi|\xi''\rangle = \xi''\langle\xi'|\xi''\rangle.$$
Hence, subtracting,
$$(\xi'-\xi'')\langle\xi'|\xi''\rangle = 0, \tag{16}$$
showing that, if ξ' ≠ ξ'', <ξ'|ξ''> = 0 and the two eigenvectors |ξ'>
and |ξ''> are orthogonal. This theorem will be referred to as the
orthogonality theorem. 
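
Result (i) and the orthogonality theorem can be checked on a small example. The sketch below assumes a 2x2 self-adjoint matrix as the real linear operator and uses the closed-form eigenvalues of such a matrix; the matrix entries and the helper names are illustrative assumptions.

```python
import math

def apply(op, ket):
    return [sum(row[j] * ket[j] for j in range(len(ket))) for row in op]

def inner(ket_x, ket_y):
    """<X|Y>, with the bra taken as the conjugate imaginary of ket_x."""
    return sum(x.conjugate() * y for x, y in zip(ket_x, ket_y))

# Illustrative real (self-adjoint) operator [[a, b], [conj(b), d]].
xi = [[2, 1 - 1j], [1 + 1j, 3]]
a, b, d = xi[0][0], xi[0][1], xi[1][1]

# Eigenvalues of a 2x2 self-adjoint matrix: both are real numbers.
mean = (a + d) / 2
radius = math.sqrt(((a - d) / 2) ** 2 + b.real ** 2 + b.imag ** 2)
eigenvalues = [mean + radius, mean - radius]

# For b != 0 an eigenket belonging to the eigenvalue lam is (b, lam - a).
eigenkets = [[b, lam - a] for lam in eigenvalues]

for lam, ket in zip(eigenvalues, eigenkets):
    print(lam, apply(xi, ket) == [lam * c for c in ket])

# Eigenkets belonging to different eigenvalues are orthogonal.
print(inner(eigenkets[0], eigenkets[1]) == 0)
```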

We have been discussing properties of the eigenvalues and eigen- 
vectors of a real linear operator, but have not yet considered the 
question of whether, for a given real linear operator, any eigenvalues 
and eigenvectors exist, and if so, how to find them. This question 
is in general very difficult to answer. There is one useful special case, 
however, which is quite tractable, namely when the real linear
operator, ξ say, satisfies an algebraic equation
$$\phi(\xi) \equiv \xi^n + a_1\xi^{n-1} + a_2\xi^{n-2} + \dots + a_n = 0, \tag{17}$$
the coefficients a being numbers. This equation means, of course,
that the linear operator φ(ξ) produces the result zero when applied
to any ket vector or to any bra vector.


Let (17) be the simplest algebraic equation that ξ satisfies. Then
it will be shown that

(α) The number of eigenvalues of ξ is n.

(β) There are so many eigenkets of ξ that any ket whatever can
be expressed as a sum of such eigenkets.

The algebraic form φ(ξ) can be factorized into n linear factors, the
result being
$$\phi(\xi) \equiv (\xi-c_1)(\xi-c_2)(\xi-c_3)\dots(\xi-c_n) \tag{18}$$
say, the c's being numbers, not assumed to be all different. This
factorization can be performed with ξ a linear operator just as well
as with ξ an ordinary algebraic variable, since there is nothing
occurring in (18) that does not commute with ξ. Let the quotient
when φ(ξ) is divided by (ξ−cᵣ) be χᵣ(ξ), so that
$$\phi(\xi) \equiv (\xi-c_r)\chi_r(\xi) \qquad (r = 1, 2, 3, \dots, n).$$
Then, for any ket |P>,
$$(\xi-c_r)\chi_r(\xi)|P\rangle = \phi(\xi)|P\rangle = 0. \tag{19}$$
Now χᵣ(ξ)|P> cannot vanish for every ket |P>, as otherwise χᵣ(ξ)
itself would vanish and we should have ξ satisfying an algebraic
equation of degree n−1, which would contradict the assumption that
(17) is the simplest equation that ξ satisfies. If we choose |P> so that
χᵣ(ξ)|P> does not vanish, then equation (19) shows that χᵣ(ξ)|P> is
an eigenket of ξ, belonging to the eigenvalue cᵣ. The argument holds
for each value of r from 1 to n, and hence each of the c's is an
eigenvalue of ξ. No other number can be an eigenvalue of ξ, since if ξ' is
any eigenvalue, belonging to an eigenket |ξ'>,
$$\xi|\xi'\rangle = \xi'|\xi'\rangle$$
and we can deduce
$$\phi(\xi)|\xi'\rangle = \phi(\xi')|\xi'\rangle,$$
and since the left-hand side vanishes we must have φ(ξ') = 0.

To complete the proof of (α) we must verify that the c's are all
different. Suppose the c's are not all different and cₛ occurs m times
say, with m > 1. Then φ(ξ) is of the form
$$\phi(\xi) \equiv (\xi-c_s)^m\,\theta(\xi),$$
with θ(ξ) a rational integral function of ξ. Equation (17) now gives us
$$(\xi-c_s)^m\,\theta(\xi)|A\rangle = 0 \tag{20}$$
for any ket |A>. Since cₛ is an eigenvalue of ξ it must be real, so that
ξ−cₛ is a real linear operator. Equation (20) is now of the same form
as equation (8) with ξ−cₛ for ξ and θ(ξ)|A> for |P>. From the theorem
connected with equation (8) we can infer that
$$(\xi-c_s)\,\theta(\xi)|A\rangle = 0.$$
Since the ket |A> is arbitrary,
$$(\xi-c_s)\,\theta(\xi) = 0,$$
which contradicts the assumption that (17) is the simplest equation
that ξ satisfies. Hence the c's are all different and (α) is proved.
Let χᵣ(cᵣ) be the number obtained when cᵣ is substituted for ξ in
the algebraic expression χᵣ(ξ). Since the c's are all different, χᵣ(cᵣ)
cannot vanish. Consider now the expression
$$\sum_r \frac{\chi_r(\xi)}{\chi_r(c_r)} - 1. \tag{21}$$
If cₛ is substituted for ξ here, every term in the sum vanishes except
the one for which r = s, since χᵣ(ξ) contains (ξ−cₛ) as a factor when
r ≠ s, and the term for which r = s is unity, so the whole expression
vanishes. Thus the expression (21) vanishes when ξ is put equal to
any of the n numbers c₁, c₂, ..., cₙ. Since, however, the expression
is only of degree n−1 in ξ, it must vanish identically. If we now
apply the linear operator (21) to an arbitrary ket |P> and equate
the result to zero, we get
$$|P\rangle = \sum_r \frac{1}{\chi_r(c_r)}\,\chi_r(\xi)|P\rangle. \tag{22}$$
Each term in the sum on the right here is, according to (19), an
eigenket of ξ, if it does not vanish. Equation (22) thus expresses the
arbitrary ket |P> as a sum of eigenkets of ξ, and thus (β) is proved.
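
Equation (22) can be tried out on a concrete case. The sketch assumes a 3x3 real symmetric matrix as ξ whose simplest algebraic equation is (ξ−1)(ξ+1)(ξ−2) = 0; the matrix, the ket and the helper names are hypothetical, and exact fractions are used so that the comparisons are not affected by rounding.

```python
from fractions import Fraction

def apply(op, ket):
    return [sum(row[j] * ket[j] for j in range(len(ket))) for row in op]

def chi_on_ket(xi, roots, r, ket):
    """chi_r(xi)|ket>: apply (xi - c_s) for every s != r, one factor at a time."""
    out = list(ket)
    for s, c_s in enumerate(roots):
        if s != r:
            shifted = apply(xi, out)
            out = [x - c_s * y for x, y in zip(shifted, out)]
    return out

def chi_at_root(roots, r):
    """The number chi_r(c_r): product of (c_r - c_s) over s != r."""
    value = Fraction(1)
    for s, c_s in enumerate(roots):
        if s != r:
            value *= roots[r] - c_s
    return value

xi = [[0, 1, 0], [1, 0, 0], [0, 0, 2]]   # illustrative real operator
roots = [1, -1, 2]                        # the numbers c_1, c_2, c_3
P = [1, 2, 3]                             # arbitrary ket

# One term of the sum in (22) for each r.
terms = [[x / chi_at_root(roots, r) for x in chi_on_ket(xi, roots, r, P)]
         for r in range(len(roots))]
total = [sum(col) for col in zip(*terms)]

print(total == P)                         # the terms add up to |P>
for c_r, term in zip(roots, terms):
    print(apply(xi, term) == [c_r * x for x in term])   # each term is an eigenket
```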

As a simple example we may consider a real linear operator σ that
satisfies the equation
$$\sigma^2 = 1. \tag{23}$$
Then σ has the two eigenvalues 1 and −1. Any ket |P> can be
expressed as
$$|P\rangle = \tfrac{1}{2}(1+\sigma)|P\rangle + \tfrac{1}{2}(1-\sigma)|P\rangle.$$
It is easily verified that the two terms on the right here are eigenkets
of σ, belonging to the eigenvalues 1 and −1 respectively, when they
do not vanish.
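
A minimal numerical sketch of this example follows, assuming σ is represented by a particular 2x2 self-adjoint matrix whose square is unity; the matrix and the ket are illustrative choices, and any other representation satisfying (23) would serve equally well.

```python
def apply(op, ket):
    return [sum(row[j] * ket[j] for j in range(len(ket))) for row in op]

# Illustrative real operator with sigma^2 = 1 (an assumed representation).
sigma = [[0, -1j], [1j, 0]]
P = [2, 4j]                       # arbitrary ket

sigma_P = apply(sigma, P)
plus = [0.5 * (p + s) for p, s in zip(P, sigma_P)]    # (1/2)(1 + sigma)|P>
minus = [0.5 * (p - s) for p, s in zip(P, sigma_P)]   # (1/2)(1 - sigma)|P>

print([a + b for a, b in zip(plus, minus)] == P)      # the two terms rebuild |P>
print(apply(sigma, plus) == plus)                     # eigenvalue 1
print(apply(sigma, minus) == [-c for c in minus])     # eigenvalue -1
```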


10. Observables 

We have made a number of assumptions about the way in which 
states and dynamical variables are to be represented mathematically 
in the theory. These assumptions are not, by themselves, laws of 
nature, but become laws of nature when we make some further 
assumptions that provide a physical interpretation of the theory. 
Such further assumptions must take the form of establishing con- 
nexions between the results of observations, on one hand, and the 
equations of the mathematical formalism on the other. 

When we make an observation we measure some dynamical variable. 
It is obvious physically that the result of such a measurement must 
always be a real number, so we should expect that any dynamical
variable that we can measure must be a real dynamical variable.
One might think one could measure a complex dynamical variable 
by measuring separately its real and pure imaginary parts. But this 
would involve two measurements or two observations, which would 
be all right in classical mechanics, but would not do in quantum 
mechanics, where two observations in general interfere with one 
another—it is not in general permissible to consider that two observa- 
tions can be made exactly simultaneously, and if they are made in 
quick succession the first will usually disturb the state of the system 
and introduce an indeterminacy that will affect the second. We 
therefore have to restrict the dynamical variables that we can 
measure to be real, the condition for this in quantum mechanics 
being as given in §8. Not every real dynamical variable can be 
measured, however. A further restriction is needed, as we shall see 
later. 

We now make some assumptions for the physical interpretation of 
the theory. If the dynamical system is in an eigenstate of a real
dynamical variable ξ, belonging to the eigenvalue ξ', then a measurement
of ξ will certainly give as result the number ξ'. Conversely, if the system
is in a state such that a measurement of a real dynamical variable ξ is
certain to give one particular result (instead of giving one or other of
several possible results according to a probability law, as is in general
the case), then the state is an eigenstate of ξ and the result of the
measurement is the eigenvalue of ξ to which this eigenstate belongs. These
assumptions are reasonable on account of the eigenvalues of real 
linear operators being always real numbers. 

Some of the immediate consequences of the assumptions will be 
noted. If we have two or more eigenstates of a real dynamical
variable ξ belonging to the same eigenvalue ξ', then any state
formed by superposition of them will also be an eigenstate of ξ
belonging to the eigenvalue ξ'. We can infer that if we have two or
more states for which a measurement of ξ is certain to give the result
ξ', then for any state formed by superposition of them a measurement
of ξ will still be certain to give the result ξ'. This gives us some insight
into the physical significance of superposition of states. Again, two
eigenstates of ξ belonging to different eigenvalues are orthogonal.
We can infer that two states for which a measurement of ξ is certain
to give two different results are orthogonal. This gives us some
insight into the physical significance of orthogonal states.

When we measure a real dynamical variable ξ, the disturbance
involved in the act of measurement causes a jump in the state of the
dynamical system. From physical continuity, if we make a second
measurement of the same dynamical variable ξ immediately after
the first, the result of the second measurement must be the same as
that of the first. Thus after the first measurement has been made,
there is no indeterminacy in the result of the second. Hence, after
the first measurement has been made, the system is in an eigenstate
of the dynamical variable ξ, the eigenvalue it belongs to being equal
to the result of the first measurement. This conclusion must still hold 
if the second measurement is not actually made. In this way we see 
that a measurement always causes the system to jump into an eigen- 
state of the dynamical variable that is being measured, the eigenvalue 
this eigenstate belongs to being equal to the result of the measure- 
ment. 

We can infer that, with the dynamical system in any state, any 
result of a measurement of a real dynamical variable is one of its eigen- 
values. Conversely, every eigenvalue is a possible result of a measure- 
ment of the dynamical variable for some state of the system, since it is 
certainly the result if the state is an eigenstate belonging to this 
eigenvalue. This gives us the physical significance of eigenvalues. 
The set of eigenvalues of a real dynamical variable are just the 
possible results of measurements of that dynamical variable and the 
calculation of eigenvalues is for this reason an important problem. 

Another assumption we make connected with the physical
interpretation of the theory is that, if a certain real dynamical variable
ξ is measured with the system in a particular state, the states into which
the system may jump on account of the measurement are such that the
original state is dependent on them. Now these states into which
the system may jump are all eigenstates of ξ, and hence the original
state is dependent on eigenstates of ξ. But the original state may be
any state, so we can conclude that any state is dependent on
eigenstates of ξ. If we define a complete set of states to be a set such that
any state is dependent on them, then our conclusion can be formulated:
the eigenstates of ξ form a complete set.

Not every real dynamical variable has sufficient eigenstates to form 
a complete set. Those whose eigenstates do not form complete sets 
are not quantities that can be measured. We obtain in this way a 
further condition that a dynamical variable has to satisfy in order
that it shall be susceptible to measurement, in addition to the condition
that it shall be real. We call a real dynamical variable whose
eigenstates form a complete set an observable. Thus any quantity 
that can be measured is an observable. 

The question now presents itself—Can every observable be 
measured? The answer theoretically is yes. In practice it may be
very awkward, or perhaps even beyond the ingenuity of the experi- 
menter, to devise an apparatus which could measure some particular 
observable, but the theory always allows one to imagine that the 
measurement can be made. 

Let us examine mathematically the condition for a real dynamical
variable ξ to be an observable. Its eigenvalues may consist of a
(finite or infinite) discrete set of numbers, or alternatively, they
may consist of all numbers in a certain range, such as all numbers
lying between a and b. In the former case, the condition that
any state is dependent on eigenstates of ξ is that any ket can
be expressed as a sum of eigenkets of ξ. In the latter case the
condition needs modification, since one may have an integral instead
of a sum, i.e. a ket |P> may be expressible as an integral of eigenkets
of ξ,
$$|P\rangle = \int |\xi'\rangle\,d\xi', \tag{24}$$
|ξ'> being an eigenket of ξ belonging to the eigenvalue ξ' and the
range of integration being the range of eigenvalues, as such a ket is
dependent on eigenkets of ξ. Not every ket dependent on eigenkets
of ξ can be expressed in the form of the right-hand side of (24), since
one of the eigenkets itself cannot, and more generally any sum of
eigenkets cannot. The condition for the eigenstates of ξ to form a
complete set must thus be formulated, that any ket |P> can be
expressed as an integral plus a sum of eigenkets of ξ, i.e.
$$|P\rangle = \int |\xi'c\rangle\,d\xi' + \sum_r |\xi^r d\rangle, \tag{25}$$
where the |ξ'c>, |ξʳd> are all eigenkets of ξ, the labels c and d being
inserted to distinguish them when the eigenvalues ξ' and ξʳ are equal,
and where the integral is taken over the whole range of eigenvalues
and the sum is taken over any selection of them. If this condition
is satisfied in the case when the eigenvalues of ξ consist of a range
of numbers, then ξ is an observable.

There is a more general case that sometimes occurs, namely the
eigenvalues of ξ may consist of a range of numbers together with a
discrete set of numbers lying outside the range. In this case the
condition that ξ shall be an observable is still that any ket shall be
expressible in the form of the right-hand side of (25), but the sum
over r is now a sum over the discrete set of eigenvalues as well as a
selection of those in the range.

It is often very difficult to decide mathematically whether a par- 
ticular real dynamical variable satisfies the condition for being an 
observable or not, because the whole problem of finding eigenvalues 
and eigenvectors is in general very difficult. However, we may have 
good reason on experimental grounds for believing that the dynamical 
variable can be measured and then we may reasonably assume that it 
is an observable even though the mathematical proof is missing. This is
a thing we shall frequently do during the course of development of the 
theory, e.g. we shall assume the energy of any dynamical system to be 
always an observable, even though it is beyond the power of present- 
day mathematical analysis to prove it so except in simple cases. 

In the special case when the real dynamical variable is a number, 
every state is an eigenstate and the dynamical variable is obviously 
an observable. Any measurement of it always gives the same result, 
so it is just a physical constant, like the charge on an electron. 
A physical constant in quantum mechanics may thus be looked upon 
either as an observable with a single eigenvalue or as a mere number 
appearing in the equations, the two points of view being equivalent. 

If the real dynamical variable satisfies an algebraic equation, then 
the result (8) of the preceding section shows that the dynamical 
variable is an observable. Such an observable has a finite number 
of eigenvalues. Conversely, any observable with a finite number of 
eigenvalues satisfies an algebraic equation, since if the observable ξ
has as its eigenvalues ξ', ξ'', ..., ξ^n, then

$$(\xi - \xi')(\xi - \xi'') \cdots (\xi - \xi^n)\,|P\rangle = 0$$

holds for |P> any eigenket of ξ, and thus it holds for any |P>
whatever, because any ket can be expressed as a sum of eigenkets of ξ
on account of ξ being an observable. Hence

$$(\xi - \xi')(\xi - \xi'') \cdots (\xi - \xi^n) = 0. \tag{26}$$
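As a minimal numerical sketch of (26), one may take a hypothetical two-state observable, here the matrix [[0, 1], [1, 0]] with eigenvalues +1 and -1, and check that the product of factors vanishes. The matrix and the helper names `matmul` and `minus_number` are illustrative assumptions, not part of the text.

```python
# Illustrative assumption: xi is represented by a 2x2 real symmetric matrix
# whose eigenvalues are +1 and -1.

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def minus_number(a, c):
    """Return the matrix a - c, the number c standing for c times unity."""
    n = len(a)
    return [[a[i][j] - (c if i == j else 0) for j in range(n)]
            for i in range(n)]

xi = [[0, 1], [1, 0]]
eigenvalues = [1, -1]

# Build (xi - xi')(xi - xi'') factor by factor, starting from unity.
product = [[1, 0], [0, 1]]
for value in eigenvalues:
    product = matmul(product, minus_number(xi, value))

print(product)  # the zero matrix, in agreement with (26)
```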

As an example we may consider the linear operator |A><A|, where
|A> is a normalized ket. This linear operator is real according to (7),
and its square is

$$\{|A\rangle\langle A|\}^2 = |A\rangle\langle A|A\rangle\langle A| = |A\rangle\langle A| \tag{27}$$

since <A|A> = 1. Thus its square equals itself and so it satisfies an
algebraic equation and is an observable. Its eigenvalues are 1 and 0,
with |A> as the eigenket belonging to the eigenvalue 1 and all kets
orthogonal to |A> as eigenkets belonging to the eigenvalue 0. A
measurement of the observable thus certainly gives the result 1 if
the dynamical system is in the state corresponding to |A> and the
result 0 if the system is in any orthogonal state, so the observable
may be described as the quantity which determines whether the
system is in the state |A> or not.
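The properties of |A><A| can be checked on a small example. The sketch below assumes a two-component ket with real components (3/5, 4/5), chosen only for illustration, and uses exact fractions so that the equalities hold without rounding. The names `ket_a`, `ket_b`, and `projector` are hypothetical.

```python
from fractions import Fraction

# Illustrative assumption: |A> has real components (3/5, 4/5), so <A|A> = 1
# and no complex conjugation is needed when forming the bra.
ket_a = [Fraction(3, 5), Fraction(4, 5)]
ket_b = [Fraction(4, 5), Fraction(-3, 5)]  # orthogonal to |A>

# The operator |A><A| written as a matrix of products of components.
projector = [[x * y for y in ket_a] for x in ket_a]

def apply(op, ket):
    """Multiply a matrix into a column of components."""
    return [sum(op[i][j] * ket[j] for j in range(len(ket)))
            for i in range(len(op))]

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

square = matmul(projector, projector)  # should reproduce the projector, as in (27)
on_a = apply(projector, ket_a)         # eigenvalue 1: returns |A>
on_b = apply(projector, ket_b)         # eigenvalue 0: returns the zero ket
```

With these assumed components, `square` equals `projector`, `on_a` equals `ket_a`, and `on_b` is the zero ket, matching the eigenvalues 1 and 0 stated above.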

Before concluding this section we should examine the conditions 
for an integral such as occurs in (24) to be significant. Suppose |X>
and |Y> are two kets which can be expressed as integrals of eigenkets
of the observable ξ,

$$|X\rangle = \int |\xi' x\rangle \, d\xi', \qquad |Y\rangle = \int |\xi'' y\rangle \, d\xi'',$$

x and y being used as labels to distinguish the two integrands. Then
we have, taking the conjugate imaginary of the first equation and
multiplying by the second

$$\langle X|Y\rangle = \iint \langle\xi' x|\xi'' y\rangle \, d\xi' \, d\xi''. \tag{28}$$

Consider now the single integral

$$\int \langle\xi' x|\xi'' y\rangle \, d\xi''. \tag{29}$$

From the orthogonality theorem, the integrand here must vanish
over the whole range of integration except the one point ξ'' = ξ'.
If the integrand is finite at this point, the integral (29) vanishes, and
if this holds for all ξ', we get from (28) that <X|Y> vanishes. Now
in general <X|Y> does not vanish, so in general <ξ'x|ξ'y> must be
infinitely great in such a way as to make (29) non-vanishing and
finite. The form of infinity required for this will be discussed in § 15.

In our work up to the present it has been implied that our bra and 
ket vectors are of finite length and their scalar products are finite. 
We see now the need for relaxing this condition when we are dealing 
with eigenvectors of an observable whose eigenvalues form a range. 
If we did not relax it, the phenomenon of ranges of eigenvalues could 
not occur and our theory would be too weak for most practical 
problems. 

Taking |Y> = |X> above, we get the result that in general <ξ'x|ξ'x>
is infinitely great. We shall assume that if |ξ'x> ≠ 0

$$\int \langle\xi' x|\xi'' x\rangle \, d\xi'' > 0, \tag{30}$$

as the axiom corresponding to (8) of § 6 for vectors of infinite
length.

The space of bra or ket vectors when the vectors are restricted to 
be of finite length and to have finite scalar products is called by 
mathematicians a Hilbert space. The bra and ket vectors that we 
now use form a more general space than a Hilbert space. 

We can now see that the expansion of a ket |P> in the form of the 
right-hand side of (25) is unique, provided there are not two or more 
terms in the sum referring to the same eigenvalue. To prove this 
result, let us suppose that two different expansions of |P> are
possible. Then by subtracting one from the other, we get an equation
of the form

$$0 = \int |\xi' a\rangle \, d\xi' + \sum_s |\xi^s b\rangle, \tag{31}$$

a and b being used as new labels for the eigenvectors, and the sum
over s including all terms left after the subtraction of one sum from
the other. If there is a term in the sum in (31) referring to an
eigenvalue ξ^t not in the range, we get, by multiplying (31) on the left by
<ξ^t b| and using the orthogonality theorem,

$$0 = \langle\xi^t b|\xi^t b\rangle,$$

which contradicts (8) of § 6. Again, if the integrand in (31) does not
vanish for some eigenvalue ξ'' not equal to any ξ^s occurring in the
sum, we get, by multiplying (31) on the left by <ξ''a| and using the
orthogonality theorem,

$$0 = \int \langle\xi'' a|\xi' a\rangle \, d\xi',$$

which contradicts (30). Finally, if there is a term in the sum in (31)
referring to an eigenvalue ξ^t in the range, we get, multiplying (31) on
the left by <ξ^t b|,

$$0 = \int \langle\xi^t b|\xi' a\rangle \, d\xi' + \langle\xi^t b|\xi^t b\rangle \tag{32}$$

and multiplying (31) on the left by <ξ^t a|

$$0 = \int \langle\xi^t a|\xi' a\rangle \, d\xi' + \langle\xi^t a|\xi^t b\rangle. \tag{33}$$

Now the integral in (33) is finite, so <ξ^t a|ξ^t b> is finite and <ξ^t b|ξ^t a> is
finite. The integral in (32) must then be zero, so <ξ^t b|ξ^t b> is zero and
we again have a contradiction. Thus every term in (31) must vanish
and the expansion of a ket |P> in the form of the right-hand side of
(25) must be unique.


11. Functions of observables 


Let ξ be an observable. We can multiply it by any real number k
and get another observable kξ. In order that our theory may be
self-consistent it is necessary that, when the system is in a state such
that a measurement of the observable ξ certainly gives the result ξ',
a measurement of the observable kξ shall certainly give the result kξ'.
It is easily verified that this condition is fulfilled. The ket
corresponding to a state for which a measurement of ξ certainly gives the result
ξ' is an eigenket of ξ, |ξ'> say, satisfying

$$\xi|\xi'\rangle = \xi'|\xi'\rangle.$$

This equation leads to

$$k\xi|\xi'\rangle = k\xi'|\xi'\rangle,$$

showing that |ξ'> is an eigenket of kξ belonging to the eigenvalue kξ',
and thus that a measurement of kξ will certainly give the result kξ'.

More generally, we may take any real function of ξ, f(ξ) say, and
consider it as a new observable which is automatically measured
whenever ξ is measured, since an experimental determination of the
value of ξ also provides the value of f(ξ). We need not restrict f(ξ) to
be real, and then its real and pure imaginary parts are two observables
which are automatically measured when ξ is measured. For the theory
to be consistent it is necessary that, when the system is in a state
such that a measurement of ξ certainly gives the result ξ', a
measurement of the real and pure imaginary parts of f(ξ) shall certainly give
for results the real and pure imaginary parts of f(ξ'). In the case when
f(ξ) is expressible as a power series

$$f(\xi) = c_0 + c_1\xi + c_2\xi^2 + c_3\xi^3 + \ldots,$$

the c's being numbers, this condition can again be verified by
elementary algebra. In the case of more general functions f it may not be
possible to verify the condition. The condition may then be used to
define f(ξ), which we have not yet defined mathematically. In this
way we can get a more general definition of a function of an
observable than is provided by power series.
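
The elementary algebra for the power-series case can be sketched as follows, assuming only the eigenvalue equation ξ|ξ'> = ξ'|ξ'> and that the series may be applied term by term (the summation index n is introduced here for illustration). Repeated application of ξ gives ξ^n|ξ'> = (ξ')^n|ξ'>, so that

$$f(\xi)|\xi'\rangle = \sum_n c_n \xi^n |\xi'\rangle = \sum_n c_n (\xi')^n |\xi'\rangle = f(\xi')|\xi'\rangle.$$

Thus |ξ'> is an eigenket of f(ξ) belonging to the eigenvalue f(ξ'), which is the condition required.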

We define f(ξ) in general to be that linear operator which satisfies

$$f(\xi)|\xi'\rangle = f(\xi')|\xi'\rangle \tag{34}$$

for every eigenket |ξ'> of ξ, f(ξ') being a number for each eigenvalue ξ'.
It is easily seen that this definition is self-consistent when applied to
eigenkets |ξ'> that are not independent. If we have an eigenket |ξ'A>
dependent on other eigenkets of ξ, these other eigenkets must all
belong to the same eigenvalue ξ', otherwise we should have an
equation of the type (31), which we have seen is impossible. On multiplying
the equation which expresses |ξ'A> linearly in terms of the other
eigenkets of ξ by f(ξ) on the left, we merely multiply each term in it
by the number f(ξ'), so we obviously get a consistent equation.
Further, equation (34) is sufficient to define the linear operator f(ξ)
completely, since to get the result of f(ξ) multiplied into an arbitrary
ket |P>, we have only to expand |P> in the form of the right-hand
side of (25) and take

$$f(\xi)|P\rangle = \int f(\xi')|\xi' c\rangle \, d\xi' + \sum_r f(\xi^r)|\xi^r d\rangle. \tag{35}$$
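
For an observable with only discrete eigenvalues, the recipe (35) reduces to a finite sum and can be tried numerically. The sketch below is an illustration under stated assumptions: ξ is taken to be the 2x2 matrix [[0, 1], [1, 0]], its eigenkets are left unnormalized, and the sample function f(x) = x² + 3x and the names `eigen_pairs` and `apply_function` are hypothetical.

```python
from fractions import Fraction

# Illustrative assumptions: xi is the 2x2 matrix [[0, 1], [1, 0]], whose
# eigenkets (left unnormalized) are (1, 1) and (1, -1) with eigenvalues
# +1 and -1, and f(x) = x**2 + 3*x is an arbitrary sample function.
eigen_pairs = [(1, [1, 1]), (-1, [1, -1])]

def f(x):
    return x * x + 3 * x

def dot(u, v):
    """Scalar product for kets with real components."""
    return sum(a * b for a, b in zip(u, v))

def apply_function(func, ket):
    """Expand ket in eigenkets, then multiply each term by func(eigenvalue)."""
    result = [Fraction(0)] * len(ket)
    for value, eigenket in eigen_pairs:
        # Coefficient of this eigenket in the expansion of ket.
        coeff = Fraction(dot(eigenket, ket), dot(eigenket, eigenket))
        result = [r + func(value) * coeff * e for r, e in zip(result, eigenket)]
    return result

ket_p = [1, 0]
via_expansion = apply_function(f, ket_p)

# Direct algebra for comparison: xi**2 + 3*xi = [[1, 3], [3, 1]].
direct = [1 * ket_p[0] + 3 * ket_p[1], 3 * ket_p[0] + 1 * ket_p[1]]
```

Both routes give the same ket. Only the values of `f` at the eigenvalues enter `apply_function`, so the same code would serve for a function that has no power series.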


The conjugate complex $\overline{f(\xi)}$ of f(ξ) is defined by the conjugate
imaginary equation to (34), namely

$$\langle\xi'|\,\overline{f(\xi)} = \bar{f}(\xi')\langle\xi'|,$$

holding for any eigenbra <ξ'|, $\bar{f}(\xi')$ being the conjugate complex
function to f(ξ'). Let us replace ξ' here by ξ'' and multiply the
equation on the right by the arbitrary ket |P>. Then we get, using
the expansion (25) for |P>,

$$\langle\xi''|\,\overline{f(\xi)}\,|P\rangle = \bar{f}(\xi'')\langle\xi''|P\rangle$$
$$= \int \bar{f}(\xi'')\langle\xi''|\xi' c\rangle \, d\xi' + \sum_r \bar{f}(\xi'')\langle\xi''|\xi^r d\rangle$$
$$= \int \bar{f}(\xi')\langle\xi''|\xi' c\rangle \, d\xi' + \bar{f}(\xi'')\langle\xi''|\xi'' d\rangle \tag{36}$$

with the help of the orthogonality theorem, <ξ''|ξ''d> being
understood to be zero if ξ'' is not one of the eigenvalues to which the terms
in the sum in (25) refer. Again, putting the conjugate complex
function $\bar{f}(\xi')$ for f(ξ') in (35) and multiplying on the left by <ξ''|,
we get

$$\langle\xi''|\,\bar{f}(\xi)\,|P\rangle = \int \bar{f}(\xi')\langle\xi''|\xi' c\rangle \, d\xi' + \bar{f}(\xi'')\langle\xi''|\xi'' d\rangle.$$

The right-hand side here equals that of (36), since the integrands
vanish for ξ' ≠ ξ'', and hence

$$\langle\xi''|\,\overline{f(\xi)}\,|P\rangle = \langle\xi''|\,\bar{f}(\xi)\,|P\rangle.$$

This holds for <ξ''| any eigenbra and |P> any ket, so

$$\overline{f(\xi)} = \bar{f}(\xi). \tag{37}$$

Thus the conjugate complex of the linear operator f(ξ) is the conjugate
complex function $\bar{f}$ of ξ.

It follows as a corollary that if f(ξ') is a real function of ξ', f(ξ) is
a real linear operator. f(ξ) is then also an observable, since its
eigenstates form a complete set, every eigenstate of ξ being also an
eigenstate of f(ξ).

With the above definition we are able to give a meaning to any 
function f of an observable, provided only that the domain of existence 
of the function of a real variable f(x) includes all the eigenvalues of the 
observable. If the domain of existence contains other points besides 
these eigenvalues, then the values of f(x) for these other points will 
not affect the function of the observable. The function need not be 
analytic or continuous. The eigenvalues of a function f of an observ- 
able are just the function f of the eigenvalues of the observable. 

It is important to observe that the possibility of defining a function 
f of an observable requires the existence of a unique number f(x) for 
each value of x which is an eigenvalue of the observable. Thus the 
function f(x) must be single-valued. This may be illustrated by con- 
sidering the question: When we have an observable f(A) which is a 
real function of the observable A, is the observable A a function of 
the observable f(A)? The answer to this is yes, if different eigenvalues
A' of A always lead to different values of f(A'). If, however, there
exist two different eigenvalues of A, A' and A'' say, such that
f(A') = f(A''), then, corresponding to the eigenvalue f(A') of the
observable f(A), there will not be a unique eigenvalue of the
observable A and the latter will not be a function of the observable f(A).
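
The criterion can be stated as a one-line check on the eigenvalues. The sketch below assumes a hypothetical observable A with the eigenvalues -1, 1, and 2. The helper name `is_function_of` is introduced only for illustration.

```python
def is_function_of(eigenvalues, func):
    """True if A can be recovered from func(A), i.e. if func takes
    different values on different eigenvalues of A."""
    distinct = set(eigenvalues)
    images = {func(a) for a in distinct}
    return len(images) == len(distinct)

eigenvalues_of_a = [-1, 1, 2]  # hypothetical eigenvalues of A

# f(A) = A**2: the eigenvalues -1 and 1 both give 1, so A is not a
# function of f(A).
square_case = is_function_of(eigenvalues_of_a, lambda a: a * a)

# f(A) = A**3: all three values differ, so A is a function of f(A).
cube_case = is_function_of(eigenvalues_of_a, lambda a: a ** 3)
```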

It may easily be verified mathematically, from the definition, that 
the sum or product of two functions of an observable is a function 
of that observable and that a function of a function of an observable 
is a function of that observable. Also it is easily seen that the whole 
theory of functions of an observable is symmetrical between bras and 
kets and that we could equally well work from the equation 

$$\langle\xi'|f(\xi) = f(\xi')\langle\xi'| \tag{38}$$
instead of from (34). 

We shall conclude this section with a discussion of two examples
which are of great practical importance, namely the reciprocal and
the square root. The reciprocal of an observable exists if the
observable does not have the eigenvalue zero. If the observable α does not
have the eigenvalue zero, the reciprocal observable, which we call α^{-1}
or 1/α, will satisfy

$$\alpha^{-1}|\alpha'\rangle = \alpha'^{-1}|\alpha'\rangle, \tag{39}$$

where |α'> is an eigenket of α belonging to the eigenvalue α'. Hence

$$\alpha\alpha^{-1}|\alpha'\rangle = \alpha\alpha'^{-1}|\alpha'\rangle = |\alpha'\rangle.$$

Since this holds for any eigenket |α'>, we must have

$$\alpha\alpha^{-1} = 1. \tag{40}$$

Similarly,

$$\alpha^{-1}\alpha = 1. \tag{41}$$

Either of these equations is sufficient to determine α^{-1} completely,
provided α does not have the eigenvalue zero. To prove this in the
case of (40), let x be any linear operator satisfying the equation

$$\alpha x = 1$$

and multiply both sides on the left by the α^{-1} defined by (39). The
result is

$$\alpha^{-1}\alpha x = \alpha^{-1}$$

and hence from (41)

$$x = \alpha^{-1}.$$

Equations (40) and (41) can be used to define the reciprocal, when
it exists, of a general linear operator α, which need not even be real.
One of these equations by itself is then not necessarily sufficient. If
any two linear operators α and β have reciprocals, their product αβ
has the reciprocal

$$(\alpha\beta)^{-1} = \beta^{-1}\alpha^{-1}, \tag{42}$$

obtained by taking the reciprocal of each factor and reversing their
order. We verify (42) by noting that its right-hand side gives unity
when multiplied by αβ, either on the right or on the left. This
reciprocal law for products can be immediately extended to more than
two factors, i.e.,

$$(\alpha\beta\gamma\ldots)^{-1} = \ldots\gamma^{-1}\beta^{-1}\alpha^{-1}.$$
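
The reversal of order in (42) matters when the operators do not commute. A minimal sketch with two hypothetical 2x2 matrices, using exact fractions, is given below. The matrices `alpha` and `beta` and the helper `reciprocal` are assumptions made for illustration.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def reciprocal(m):
    """Reciprocal of a 2x2 matrix with non-zero determinant."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

# Illustrative, non-commuting linear operators.
alpha = [[2, 1], [1, 1]]
beta = [[1, 2], [0, 1]]

lhs = reciprocal(matmul(alpha, beta))                       # (alpha beta)^-1
rhs = matmul(reciprocal(beta), reciprocal(alpha))           # beta^-1 alpha^-1
wrong_order = matmul(reciprocal(alpha), reciprocal(beta))   # alpha^-1 beta^-1
```

For these matrices `lhs` and `rhs` agree, while `wrong_order` gives a different matrix.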


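The square root of an observable α always exists, and is real if α
has no negative eigenvalues. We write it √α or α^{1/2}. It satisfies

$$\sqrt{\alpha}\,|\alpha'\rangle = \pm\sqrt{\alpha'}\,|\alpha'\rangle, \tag{43}$$

|α'> being an eigenket of α belonging to the eigenvalue α'. Hence

$$\sqrt{\alpha}\sqrt{\alpha}\,|\alpha'\rangle = \sqrt{\alpha'}\sqrt{\alpha'}\,|\alpha'\rangle = \alpha'|\alpha'\rangle = \alpha|\alpha'\rangle,$$

and since this holds for any eigenket |α'> we must have

$$\sqrt{\alpha}\sqrt{\alpha} = \alpha. \tag{44}$$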

On account of the ambiguity of sign in (43) there will be several
square roots. To fix one of them we must specify a particular sign
in (43) for each eigenvalue. This sign may vary irregularly from one
eigenvalue to the next and equation (43) will always define a linear
operator √α satisfying (44) and forming a square-root function of α.
If there is an eigenvalue of α with two or more independent eigenkets
belonging to it, then we must, according to our definition of a
function, have the same sign in (43) for each of these eigenkets. If we
took different signs, however, equation (44) would still hold, and hence
equation (44) by itself is not sufficient to define √α, except in the
special case when there is only one independent eigenket of α
belonging to any eigenvalue.

The number of different square roots of an observable is 2^n, where
n is the total number of eigenvalues not zero. In practice the
square-root function is used only for observables without negative
eigenvalues and the particular square root that is useful is the one for
which the positive sign is always taken in (43). This one will be called
the positive square root.
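
The count 2^n can be illustrated by enumerating the sign choices. The sketch assumes an observable represented in diagonal form by its distinct eigenvalues 4, 9, and 0, none of them negative. These values and the variable names are hypothetical.

```python
from itertools import product
from math import sqrt

# Illustrative assumption: the observable is diagonal with the distinct
# eigenvalues listed here, none of them negative.
eigenvalues = [4.0, 9.0, 0.0]

# One sign per eigenvalue; the eigenvalue zero offers no real choice.
sign_choices = [(1, -1) if value != 0 else (1,) for value in eigenvalues]

# Each square root is recorded by its own eigenvalues, as in (43).
square_roots = [
    tuple(sign * sqrt(value) for sign, value in zip(signs, eigenvalues))
    for signs in product(*sign_choices)
]

nonzero = sum(1 for value in eigenvalues if value != 0)
positive_root = square_roots[0]  # every sign taken positive
```

Here n = 2, so four square roots are found, each of which squares back to the original eigenvalues. The first entry is the positive square root.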


12. The general physical interpretation 

The assumptions that we made at the beginning of § 10 to get a 
physical interpretation of the mathematical theory are of a rather 
special kind, since they can be used only in connexion with eigen- 
states. We need some more general assumption which will enable us 
to extract physical information from the mathematics even when we 
are not dealing with eigenstates. 

In classical mechanics an observable always, as we say, ‘has a
value’ for any particular state of the system. What is there in
quantum mechanics corresponding to this? If we take any observable ξ
and any two states x and y, corresponding to the vectors <x| and |y>,
then we can form the number <x|ξ|y>. This number is not very
closely analogous to the value which an observable can ‘have’ in the
classical theory, for three reasons, namely, (i) it refers to two states
of the system, while the classical value always refers to one, (ii) it is
in general not a real number, and (iii) it is not uniquely determined
by the observable and the states, since the vectors <x| and |y> contain
arbitrary numerical factors. Even if we impose on <x| and |y> the
condition that they shall be normalized, there will still be an
undetermined factor of modulus unity in <x|ξ|y>. These three reasons cease
to apply, however, if we take the two states to be identical and |y>
to be the conjugate imaginary vector to <x|. The number that we
then get, namely <x|ξ|x>, is necessarily real, and also it is uniquely
determined when <x| is normalized, since if we multiply <x| by the
numerical factor e^{ic}, c being some real number, we must multiply
|x> by e^{-ic} and <x|ξ|x> will be unaltered.

One might thus be inclined to make the tentative assumption that
the observable ξ ‘has the value’ <x|ξ|x> for the state x, in a sense
analogous to the classical sense. This would not be satisfactory,
though, for the following reason. Let us take a second observable η,
which would have by the above assumption the value <x|η|x> for
this same state. We should then expect, from classical analogy, that
for this state the sum of the two observables would have a value
equal to the sum of the values of the two observables separately and
the product of the two observables would have a value equal to the
product of the values of the two observables separately. Actually, the
tentative assumption would give for the sum of the two observables
the value <x|ξ+η|x>, which is, in fact, equal to the sum of <x|ξ|x>
and <x|η|x>, but for the product it would give the value <x|ξη|x>
or <x|ηξ|x>, neither of which is connected in any simple way with
<x|ξ|x> and <x|η|x>.

However, since things go wrong only with the product and not with
the sum, it would be reasonable to call <x|ξ|x> the average value of
the observable ξ for the state x. This is because the average of the
sum of two quantities must equal the sum of their averages, but the
average of their product need not equal the product of their averages.
We therefore make the general assumption that if the measurement
of the observable ξ for the system in the state corresponding to |x> is
made a large number of times, the average of all the results obtained will
be <x|ξ|x>, provided |x> is normalized. If |x> is not normalized, as is
necessarily the case if the state x is an eigenstate of some observable
belonging to an eigenvalue in a range, the assumption becomes that
the average result of a measurement of ξ is proportional to <x|ξ|x>.
This general assumption provides a basis for a general physical
interpretation of the theory.
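
The behaviour of sums and products under the average can be seen in a small example. The sketch assumes a two-state system, a normalized ket with real components (3/5, 4/5), and two non-commuting observables represented by the matrices `xi` and `eta`. All of these are illustrative choices, not taken from the text.

```python
from fractions import Fraction

# Illustrative assumptions: a two-state system, a normalized ket |x> with
# real components, and two non-commuting observables xi and eta.
x = [Fraction(3, 5), Fraction(4, 5)]
xi = [[1, 0], [0, -1]]
eta = [[0, 1], [1, 0]]

def average(op, ket):
    """<x|op|x> for a ket with real components."""
    n = len(ket)
    return sum(ket[i] * op[i][j] * ket[j] for i in range(n) for j in range(n))

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(a, b):
    """Sum of two matrices given as nested lists."""
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

avg_xi = average(xi, x)                    # -7/25
avg_eta = average(eta, x)                  # 24/25
avg_sum = average(matadd(xi, eta), x)      # equals avg_xi + avg_eta
avg_product = average(matmul(xi, eta), x)  # 0, not avg_xi * avg_eta
```

The sum behaves as an average should, while <x|ξη|x> comes out as 0 against a product of averages of -168/625.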

The expression that an observable ‘has a particular value’ for a 
particular state is permissible in quantum mechanics in the special 
case when a measurement of the observable is certain to lead to the 
particular value, so that the state is an eigenstate of the observable. 
It may easily be verified from the algebra that, with this restricted
meaning for an observable ‘having a value’, if two observables have
values for a particular state, then for this state the sum of the two
observables (if this sum is an observable†) has a value equal to the
sum of the values of the two observables separately and the product
of the two observables (if this product is an observable‡) has a value
equal to the product of the values of the two observables separately.

In the general case we cannot speak of an observable having a value 
for a particular state, but we can speak of its having an average value 
for the state. We can go further and speak of the probability of its 
having any specified value for the state, meaning the probability of 
this specified value being obtained when one makes a measurement of 
the observable. This probability can be obtained from the general
assumption in the following way.

Let the observable be ξ and let the state correspond to the
normalized ket |x>. Then the general assumption tells us, not only that the
average value of ξ is <x|ξ|x>, but also that the average value of any
function of ξ, f(ξ) say, is <x|f(ξ)|x>. Take f(ξ) to be that function of ξ
which is equal to unity when ξ = a, a being some real number, and
zero otherwise. This function of ξ has a meaning according to our
general theory of functions of an observable, and it may be denoted
by δ_{ξa}, in conformity with the general notation of the symbol δ with
two suffixes given on p. 62 (equation (17)). The average value of
this function of ξ is just the probability, P_a say, of ξ having the value
a. Thus

$$P_a = \langle x|\delta_{\xi a}|x\rangle. \tag{45}$$

If a is not an eigenvalue of ξ, δ_{ξa} multiplied into any eigenket of ξ is
zero, and hence δ_{ξa} = 0 and P_a = 0. This agrees with a conclusion
of § 10, that any result of a measurement of an observable must be
one of its eigenvalues.
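
Formula (45) is easy to evaluate when ξ is given in diagonal form. The sketch below assumes a three-state system in which ξ has the eigenvalues 2, 2, and 5 and the normalized ket has real components (1/3, 2/3, 2/3). These numbers and the function name `probability` are hypothetical.

```python
from fractions import Fraction

# Illustrative assumptions: xi is diagonal with eigenvalues (2, 2, 5) and
# |x> is a normalized ket with real components (1/3, 2/3, 2/3).
eigenvalues = [2, 2, 5]
x = [Fraction(1, 3), Fraction(2, 3), Fraction(2, 3)]

def probability(a):
    """P_a = <x|delta_{xi a}|x>: delta_{xi a} keeps only the components
    of |x> along eigenkets belonging to the eigenvalue a."""
    return sum(c * c for value, c in zip(eigenvalues, x) if value == a)

p2, p5, p7 = probability(2), probability(5), probability(7)

# The average value <x|xi|x>, for comparison with the weighted sum of
# eigenvalues.
average = sum(value * c * c for value, c in zip(eigenvalues, x))
```

The number 7 is not an eigenvalue, so `p7` vanishes. The remaining probabilities add up to unity, and weighting the eigenvalues by them reproduces `average`.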

If the possible results of a measurement of ξ form a range of
numbers, the probability of ξ having exactly a particular value will be
zero in most physical problems. The quantity of physical importance
is then the probability of ξ having a value within a small range, say
from a to a+da. This probability, which we may call P(a) da, is
equal to the average value of that function of ξ which is equal to
unity for ξ lying within the range a to a+da and zero otherwise.
This function of ξ has a meaning according to our general theory of
functions of an observable. Denoting it by χ(ξ), we have

$$P(a)\,da = \langle x|\chi(\xi)|x\rangle. \tag{46}$$

If the range a to a+da does not include any eigenvalues of ξ, we
have as above χ(ξ) = 0 and P(a) = 0. If |x> is not normalized, the
right-hand sides of (45) and (46) will still be proportional to the
probability of ξ having the value a and lying within the range a to
a+da respectively.

† This is not obviously so, since the sum may not have sufficient eigenstates to
form a complete set, in which case the sum, considered as a single quantity, would
not be measurable.
‡ Here the reality condition may fail, as well as the condition for the eigenstates
to form a complete set.

The assumption of § 10, that a measurement of ξ is certain to give
the result ξ' if the system is in an eigenstate of ξ belonging to the
eigenvalue ξ', is consistent with the general assumption for physical
interpretation and can in fact be deduced from it. Working from the
general assumption we see that, if |ξ'> is an eigenket of ξ belonging
to the eigenvalue ξ', then, in the case of discrete eigenvalues of ξ,

$$\delta_{\xi a}|\xi'\rangle = 0 \quad \text{unless } a = \xi',$$

and in the case of a range of eigenvalues of ξ

$$\chi(\xi)|\xi'\rangle = 0 \quad \text{unless the range } a \text{ to } a+da \text{ includes } \xi'.$$

In either case, for the state corresponding to |ξ'>, the probability of
ξ having any value other than ξ' is zero.

An eigenstate of ξ belonging to an eigenvalue ξ' lying in a range
is a state which cannot strictly be realized in practice, since it would
need an infinite amount of precision to get ξ to equal exactly ξ'.
The most that could be attained in practice would be to get ξ to lie
within a narrow range about the value ξ'. The system would then
be in a state approximating to an eigenstate of ξ. Thus an eigenstate
belonging to an eigenvalue in a range is a mathematical idealization 
of what can be attained in practice. All the same such eigenstates 
play a very useful role in the theory and one could not very well do 
without them. Science contains many examples of theoretical con- 
cepts which are limits of things met with in practice and are useful 
for the precise formulation of laws of nature, although they are not 
realizable experimentally, and this is just one more of them. It may 
be that the infinite length of the ket vectors corresponding to these 
eigenstates is connected with their unrealizability, and that all realiz- 
able states correspond to ket vectors that can be normalized and that 
form a Hilbert space. 

13. Commutability and compatibility

A state may be simultaneously an eigenstate of two observables. 
If the state corresponds to the ket vector |A> and the observables are
ξ and η, we should then have the equations

$$\xi|A\rangle = \xi'|A\rangle,$$
$$\eta|A\rangle = \eta'|A\rangle,$$

where ξ' and η' are eigenvalues of ξ and η respectively. We can now
deduce

$$\xi\eta|A\rangle = \xi\eta'|A\rangle = \xi'\eta'|A\rangle = \xi'\eta|A\rangle = \eta\xi'|A\rangle = \eta\xi|A\rangle,$$

or

$$(\xi\eta - \eta\xi)|A\rangle = 0.$$

This suggests that the chances for the existence of a simultaneous
eigenstate are most favourable if ξη - ηξ = 0 and the two observables
commute. If they do not commute a simultaneous eigenstate is not
impossible, but is rather exceptional. On the other hand, if they do 
commute there exist so many simultaneous eigenstates that they form a 
complete set, as will now be proved. 

Let ξ and η be two commuting observables. Take an eigenket of
η, |η'> say, belonging to the eigenvalue η', and expand it in terms
of eigenkets of ξ in the form of the right-hand side of (25), thus

$$|\eta'\rangle = \int |\xi'\eta' c\rangle \, d\xi' + \sum_r |\xi^r\eta' d\rangle. \tag{47}$$

The eigenkets of ξ on the right-hand side here have η' inserted in
them as an extra label, in order to remind us that they come from
the expansion of a special ket vector, namely |η'>, and not a general
one as in equation (25). We can now show that each of these
eigenkets of ξ is also an eigenket of η belonging to the eigenvalue η'. We
have

$$0 = (\eta - \eta')|\eta'\rangle = \int (\eta - \eta')|\xi'\eta' c\rangle \, d\xi' + \sum_r (\eta - \eta')|\xi^r\eta' d\rangle. \tag{48}$$

Now the ket (η - η')|ξ^r η'd> satisfies

$$\xi(\eta - \eta')|\xi^r\eta' d\rangle = (\eta - \eta')\xi|\xi^r\eta' d\rangle = (\eta - \eta')\xi^r|\xi^r\eta' d\rangle = \xi^r(\eta - \eta')|\xi^r\eta' d\rangle,$$

showing that it is an eigenket of ξ belonging to the eigenvalue ξ^r,
and similarly the ket (η - η')|ξ'η'c> is an eigenket of ξ belonging to
the eigenvalue ξ'. Equation (48) thus gives an integral plus a sum
of eigenkets of ξ equal to zero, which, as we have seen with equation
(31), is impossible unless the integrand and every term in the sum
vanishes. Hence

$$(\eta - \eta')|\xi'\eta' c\rangle = 0, \qquad (\eta - \eta')|\xi^r\eta' d\rangle = 0,$$

so that all the kets appearing on the right-hand side of (47) are
eigenkets of η as well as of ξ. Equation (47) now gives |η'> expanded
in terms of simultaneous eigenkets of ξ and η. Since any ket can be
expanded in terms of eigenkets |η'> of η, it follows that any ket can
be expanded in terms of simultaneous eigenkets of ξ and η, and thus
the simultaneous eigenstates form a complete set.
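
A small numerical illustration of the theorem, under stated assumptions, is given below. The matrices `xi`, `eta`, and `zeta` are hypothetical 2x2 examples, with `eta` chosen equal to 2 + `xi` so that the two commute, and the helper `eigenvalue_or_none` is introduced only for this sketch.

```python
# Illustrative assumptions: xi and eta are 2x2 real symmetric matrices,
# chosen so that eta = 2 + xi and the two therefore commute.
xi = [[0, 1], [1, 0]]
eta = [[2, 1], [1, 2]]
zeta = [[1, 0], [0, -1]]  # does not commute with xi

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    """The matrix ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(a))] for i in range(len(a))]

def apply(op, ket):
    """Multiply a matrix into a column of components."""
    return [sum(op[i][j] * ket[j] for j in range(len(ket)))
            for i in range(len(op))]

def eigenvalue_or_none(op, ket):
    """Return c if op|ket> = c|ket> for some number c, else None."""
    image = apply(op, ket)
    pivot = next(i for i, k in enumerate(ket) if k != 0)
    c = image[pivot] / ket[pivot]
    return c if all(image[i] == c * ket[i] for i in range(len(ket))) else None

kets = [[1, 1], [1, -1]]  # a complete set of eigenkets of xi
labels = [(eigenvalue_or_none(xi, k), eigenvalue_or_none(eta, k)) for k in kets]
```

Each eigenket of `xi` is found to be an eigenket of `eta` as well, so the pairs in `labels` play the part of the labels ξ', η'. For the non-commuting pair `xi`, `zeta` the ket (1, 1) is an eigenket of `xi` but not of `zeta`.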

The above simultaneous eigenkets of ξ and η, |ξ'η'c> and |ξ^r η'd>,
are labelled by the eigenvalues ξ' and η', or ξ^r and η', to which they
belong, together with the labels c and d which may also be necessary.
The procedure of using eigenvalues as labels for simultaneous eigen- 
vectors will be generally followed in the future, just as it has been 
followed in the past for eigenvectors of single observables. 

The converse to the above theorem says that, if ξ and η are two
observables such that their simultaneous eigenstates form a complete set,
then ξ and η commute. To prove this, we note that, if |ξ'η'> is a
simultaneous eigenket belonging to the eigenvalues ξ' and η',

$$(\xi\eta - \eta\xi)|\xi'\eta'\rangle = (\xi'\eta' - \eta'\xi')|\xi'\eta'\rangle = 0. \tag{49}$$

Since the simultaneous eigenstates form a complete set, an arbitrary
ket |P> can be expanded in terms of simultaneous eigenkets |ξ'η'>,
for each of which (49) holds, and hence

$$(\xi\eta - \eta\xi)|P\rangle = 0$$

and so

$$\xi\eta - \eta\xi = 0.$$

The idea of simultaneous eigenstates may be extended to more 
than two observables and the above theorem and its converse still 
hold, i.e. if any set of observables commute, each with all the others, 
their simultaneous eigenstates form a complete set, and conversely. 
The same arguments used for the proof with two observables are 
adequate for the general case; e.g., if we have three commuting
observables ξ, η, ζ, we can expand any simultaneous eigenket of ξ
and η in terms of eigenkets of ζ and then show that each of these
eigenkets of ζ is also an eigenket of ξ and of η. Thus the simultaneous
eigenket of ξ and η is expanded in terms of simultaneous eigenkets
of ξ, η, and ζ, and since any ket can be expanded in terms of
simultaneous eigenkets of ξ and η, it can also be expanded in terms of
simultaneous eigenkets of ξ, η, and ζ.

The orthogonality theorem applied to simultaneous eigenkets tells
us that two simultaneous eigenvectors of a set of commuting
observables are orthogonal if the sets of eigenvalues to which they belong
differ in any way.

Owing to the simultaneous eigenstates of two or more commuting
observables forming a complete set, we can set up a theory of
functions of two or more commuting observables on the same lines as the
theory of functions of a single observable given in § 11. If ξ, η, ζ,...
are commuting observables, we define a general function f of them
to be that linear operator f(ξ, η, ζ,...) which satisfies

$$f(\xi, \eta, \zeta, \ldots)|\xi'\eta'\zeta'\ldots\rangle = f(\xi', \eta', \zeta', \ldots)|\xi'\eta'\zeta'\ldots\rangle, \tag{50}$$

where |ξ'η'ζ'...> is any simultaneous eigenket of ξ, η, ζ,... belonging
to the eigenvalues ξ', η', ζ',.... Here f is any function such that
f(a, b, c,...) is defined for all values of a, b, c,... which are eigenvalues
of ξ, η, ζ,... respectively. As with a function of a single observable
defined by (34), we can show that f(ξ, η, ζ,...) is completely
determined by (50), that

$$\overline{f(\xi, \eta, \zeta, \ldots)} = \bar{f}(\xi, \eta, \zeta, \ldots),$$

corresponding to (37), and that if f(a, b, c,...) is a real function,
f(ξ, η, ζ,...) is real and is an observable.
We can now proceed to generalize the results (45) and (46). Given
a set of commuting observables $\xi, \eta, \zeta, \ldots$, we may form that
function of them which is equal to unity when $\xi = a$, $\eta = b$,
$\zeta = c, \ldots$, $a, b, c, \ldots$ being real numbers, and is equal to
zero when any of these conditions is not fulfilled. This function may be
written $\delta_{\xi a}\delta_{\eta b}\delta_{\zeta c}\ldots$, and is in
fact just the product in any order of the factors $\delta_{\xi a}$,
$\delta_{\eta b}$, $\delta_{\zeta c}, \ldots$ defined as functions of single
observables, as may be seen by substituting this product for
$f(\xi, \eta, \zeta, \ldots)$ in the left-hand side of (50). The average
value of this function for any state is the probability,
$P_{abc\ldots}$ say, of $\xi, \eta, \zeta, \ldots$ having the values
$a, b, c, \ldots$ respectively for that state. Thus if the state
corresponds to the normalized ket vector $|x\rangle$, we get from our
general assumption for physical interpretation

$$P_{abc\ldots} = \langle x|\delta_{\xi a}\delta_{\eta b}\delta_{\zeta c}\ldots|x\rangle. \tag{51}$$

$P_{abc\ldots}$ is zero unless each of the numbers $a, b, c, \ldots$ is an
eigenvalue of the corresponding observable. If any of the numbers
$a, b, c, \ldots$ is an eigenvalue in a range of eigenvalues of the
corresponding observable, $P_{abc\ldots}$ will usually again be zero, but
in this case we ought to replace the requirement that this observable
shall have exactly one value by the requirement that it shall have a
value lying within a small range, which involves replacing one of the
$\delta$ factors in (51) by a factor like the $\chi(\xi)$ of equation (46).
On carrying out such a replacement for each of the observables
$\xi, \eta, \zeta, \ldots$, whose corresponding numerical value
$a, b, c, \ldots$ lies in a range of eigenvalues, we shall get a
probability which does not in general vanish.
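
As a rough illustration of (51), one may take a hypothetical system with only four independent states and work in a basis of simultaneous eigenkets of two commuting observables $\xi$ and $\eta$ with discrete eigenvalues. The names `labels`, `x` and `joint_probability`, and the particular numbers, are assumptions made for this sketch and do not come from the text. In such a basis the product $\delta_{\xi a}\delta_{\eta b}$ merely picks out the basic kets labelled by $(a, b)$, so that $P_{ab}$ reduces to a sum of squared moduli.

```python
import math

# Eigenvalue labels (xi', eta') of four simultaneous eigenkets (illustrative values).
labels = [(1, -1), (1, 1), (2, -1), (2, 1)]

# Numbers <xi' eta'|x> for an arbitrary ket |x>, then normalized.
raw = [1 + 1j, 2, -1j, 0.5]
norm = math.sqrt(sum(abs(c) ** 2 for c in raw))
x = [c / norm for c in raw]

def joint_probability(a, b):
    """P_ab = <x| delta_{xi a} delta_{eta b} |x> in the simultaneous eigenbasis."""
    return sum(abs(c) ** 2
               for (xi, eta), c in zip(labels, x)
               if xi == a and eta == b)

# The probabilities of all pairs of eigenvalues add up to unity.
total = sum(joint_probability(a, b) for a, b in labels)

# P_ab vanishes when a is not an eigenvalue of xi.
p_not_eigenvalue = joint_probability(3, 1)
```

The sketch covers only the discrete case; for eigenvalues in a range the $\delta$ factor would have to be replaced by a factor selecting a small interval, as described above.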

If certain observables commute, there exist states for which they all 
have particular values, in the sense explained at the bottom of p. 46, 
namely the simultaneous eigenstates. Thus one can give a meaning to 
several commuting observables having values at the same time. Further, we 
see from (51) that for any state one can give a meaning to the probability 
of particular results being obtained for simultaneous measurements of 
several commuting observables. This conclusion is an important new 
development. In general one cannot make an observation on a 
system in a definite state without disturbing that state and spoiling 
it for the purposes of a second observation. One cannot then give 
any meaning to the two observations being made simultaneously. 
The above conclusion tells us, though, that in the special case when 
the two observables commute, the observations are to be considered 
as non-interfering or compatible, in such a way that one can give a 
meaning to the two observations being made simultaneously and can 
discuss the probability of any particular results being obtained. The 
two observations may, in fact, be considered as a single observation 
of a more complicated type, the result of which is expressible by two 
numbers instead of a single number. From the point of view of general
theory, any two or more commuting observables may be counted as a
single observable, the result of a measurement of which consists of two or
more numbers. The states for which this measurement is certain to
lead to one particular result are the simultaneous eigenstates. 


III
REPRESENTATIONS

14. Basic vectors

In the preceding chapters we set up an algebraic scheme involving 
certain abstract quantities of three kinds, namely bra vectors, ket 
vectors, and linear operators, and we expressed some of the funda- 
mental laws of quantum mechanics in terms of them. It would be 
possible to continue to develop the theory in terms of these abstract 
quantities and to use them for applications to particular problems. 
However, for some purposes it is more convenient to replace the 
abstract quantities by sets of numbers with analogous mathematical 
properties and to work in terms of these sets of numbers. The proce- 
dure is similar to using coordinates in geometry, and has the advan- 
tage of giving one greater mathematical power for the solving of 
particular problems. 

The way in which the abstract quantities are to be replaced by
numbers is not unique, there being many possible ways corresponding 
to the many systems of coordinates one can have in geometry. Each 
of these ways is called a representation and the set of numbers that 
replace an abstract quantity is called the representative of that 
abstract quantity in the representation. Thus the representative of 
an abstract quantity corresponds to the coordinates of a geometrical 
object. When one has a particular problem to work out in quantum 
mechanics, one can minimize the labour by using a representation 
in which the representatives of the more important abstract quanti- 
ties occurring in that problem are as simple as possible. 

To set up a representation in a general way, we take a complete 
set of bra vectors, i.e. a set such that any bra can be expressed 
linearly in terms of them (as a sum or an integral or possibly an 
integral plus a sum). These bras we call the basic bras of the repre- 
sentation. They are sufficient, as we shall see, to fix the representation 
completely. 

Take any ket $|a\rangle$ and form its scalar product with each of the
basic bras. The numbers so obtained constitute the representative of
$|a\rangle$. They are sufficient to determine the ket $|a\rangle$
completely, since if there is a second ket, $|a_1\rangle$ say, for which
these numbers are the same, the difference $|a\rangle - |a_1\rangle$ will
have its scalar product with any basic bra vanishing, and hence its
scalar product with any bra whatever will vanish and
$|a\rangle - |a_1\rangle$ itself will vanish.

We may suppose the basic bras to be labelled by one or more
parameters, $\lambda_1, \lambda_2, \ldots, \lambda_u$, each of which may
take on certain numerical values. The basic bras will then be written
$\langle\lambda_1\lambda_2\ldots\lambda_u|$ and the representative of
$|a\rangle$ will be written
$\langle\lambda_1\lambda_2\ldots\lambda_u|a\rangle$. This representative
will now consist of a set of numbers, one for each set of values that
$\lambda_1, \lambda_2, \ldots, \lambda_u$ may have in their respective
domains. Such a set of numbers just forms a function of the variables
$\lambda_1, \lambda_2, \ldots, \lambda_u$. Thus the representative of a ket
may be looked upon either as a set of numbers or as a function of the
variables used to label the basic bras.

If the number of independent states of our dynamical system is
finite, equal to $n$ say, it is sufficient to take $n$ basic bras, which
may be labelled by a single parameter $\lambda$ taking on the values
$1, 2, 3, \ldots, n$. The representative of any ket $|a\rangle$ now consists
of the set of $n$ numbers $\langle 1|a\rangle$, $\langle 2|a\rangle$,
$\langle 3|a\rangle, \ldots, \langle n|a\rangle$, which are precisely the
coordinates of the vector $|a\rangle$ referred to a system of coordinates
in the usual way.
The idea of the representative of a ket vector is just a generalization 
of the idea of the coordinates of an ordinary vector and reduces to 
the latter when the number of dimensions of the space of the ket 
vectors is finite. 
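
The statement that the representative determines the ket can be checked in a small hypothetical example. The sketch below assumes a two-dimensional space and two basic bras given directly by their components; the bras are chosen to be independent but not orthogonal, since orthogonality is not needed for this argument. The names `basic_bras`, `representative` and `ket_from_representative` are introduced only for illustration.

```python
# Components of two basic bras <1| and <2| (assumed example, not orthogonal).
basic_bras = [[1, 0], [1, 1j]]

def representative(ket):
    """The numbers <k|a>, one for each basic bra <k|."""
    return [sum(b * a for b, a in zip(bra, ket)) for bra in basic_bras]

def ket_from_representative(rep):
    """Solve the two linear equations <k|a> = rep[k] for the ket (Cramer's rule)."""
    (p, q), (r, s) = basic_bras
    det = p * s - q * r          # non-zero because the bras are independent
    return [(rep[0] * s - q * rep[1]) / det,
            (p * rep[1] - rep[0] * r) / det]

a = [2 - 1j, 0.5 + 3j]           # an arbitrary ket
rep = representative(a)          # its representative
recovered = ket_from_representative(rep)
```

Here `recovered` agrees with `a`, so that two kets with the same representative cannot differ.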

In a general representation there is no need for the basic bras to 
be all independent. In most representations used in practice, how- 
ever, they are all independent, and also satisfy the more stringent 
condition that any two of them are orthogonal. The representation 
is then called an orthogonal representation. 

Take an orthogonal representation with basic bras
$\langle\lambda_1\lambda_2\ldots\lambda_u|$, labelled by parameters
$\lambda_1, \lambda_2, \ldots, \lambda_u$ whose domains are all real. Take
a ket $|a\rangle$ and form its representative
$\langle\lambda_1\lambda_2\ldots\lambda_u|a\rangle$. Now form the numbers
$\lambda_1\langle\lambda_1\lambda_2\ldots\lambda_u|a\rangle$ and consider
them as the representative of a new ket $|b\rangle$. This is permissible
since the numbers forming the representative of a ket are independent,
on account of the basic bras being independent. The ket $|b\rangle$ is
defined by the equation

$$\langle\lambda_1\lambda_2\ldots\lambda_u|b\rangle = \lambda_1\langle\lambda_1\lambda_2\ldots\lambda_u|a\rangle.$$

The ket $|b\rangle$ is evidently a linear function of the ket
$|a\rangle$, so it may be considered as the result of a linear operator
applied to $|a\rangle$. Calling this linear operator $L_1$, we have

$$|b\rangle = L_1|a\rangle$$

and hence

$$\langle\lambda_1\lambda_2\ldots\lambda_u|L_1|a\rangle = \lambda_1\langle\lambda_1\lambda_2\ldots\lambda_u|a\rangle.$$

This equation holds for any ket $|a\rangle$, so we get

$$\langle\lambda_1\lambda_2\ldots\lambda_u|L_1 = \lambda_1\langle\lambda_1\lambda_2\ldots\lambda_u|. \tag{1}$$

Equation (1) may be looked upon as the definition of the linear
operator $L_1$. It shows that each basic bra is an eigenbra of $L_1$, the
value of the parameter $\lambda_1$ being the eigenvalue belonging to it.

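From the condition that the basic bras are orthogonal we can
deduce that $L_1$ is real and is an observable. Let
$\lambda'_1, \lambda'_2, \ldots, \lambda'_u$ and
$\lambda''_1, \lambda''_2, \ldots, \lambda''_u$ be two sets of values for
the parameters $\lambda_1, \lambda_2, \ldots, \lambda_u$. We have, putting
$\lambda'$'s for the $\lambda$'s in (1) and multiplying on the right by
$|\lambda''_1\lambda''_2\ldots\lambda''_u\rangle$, the conjugate imaginary
of the basic bra $\langle\lambda''_1\lambda''_2\ldots\lambda''_u|$,

$$\langle\lambda'_1\lambda'_2\ldots\lambda'_u|L_1|\lambda''_1\lambda''_2\ldots\lambda''_u\rangle = \lambda'_1\langle\lambda'_1\lambda'_2\ldots\lambda'_u|\lambda''_1\lambda''_2\ldots\lambda''_u\rangle.$$

Interchanging $\lambda'$'s and $\lambda''$'s,

$$\langle\lambda''_1\lambda''_2\ldots\lambda''_u|L_1|\lambda'_1\lambda'_2\ldots\lambda'_u\rangle = \lambda''_1\langle\lambda''_1\lambda''_2\ldots\lambda''_u|\lambda'_1\lambda'_2\ldots\lambda'_u\rangle.$$

On account of the basic bras being orthogonal, the right-hand sides
here vanish unless $\lambda''_r = \lambda'_r$ for all $r$ from 1 to $u$, in
which case the right-hand sides are equal, and they are also real,
$\lambda'_1$ being real. Thus, whether the $\lambda''$'s are equal to the
$\lambda'$'s or not,

$$\begin{aligned}
\langle\lambda'_1\lambda'_2\ldots\lambda'_u|L_1|\lambda''_1\lambda''_2\ldots\lambda''_u\rangle
&= \overline{\langle\lambda''_1\lambda''_2\ldots\lambda''_u|L_1|\lambda'_1\lambda'_2\ldots\lambda'_u\rangle} \\
&= \langle\lambda'_1\lambda'_2\ldots\lambda'_u|\bar{L}_1|\lambda''_1\lambda''_2\ldots\lambda''_u\rangle
\end{aligned}$$

from equation (4) of § 8. Since the
$\langle\lambda'_1\lambda'_2\ldots\lambda'_u|$'s form a complete set of bras
and the $|\lambda''_1\lambda''_2\ldots\lambda''_u\rangle$'s form a complete
set of kets, we can infer that $L_1 = \bar{L}_1$. The further condition
required for $L_1$ to be an observable, namely that its eigenstates shall
form a complete set, is obviously satisfied since it has as eigenbras
the basic bras, which form a complete set.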
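
The argument may be tried out on a hypothetical two-dimensional example. The sketch below assumes two orthonormal basic vectors and two real labels, builds the operator that multiplies each representative by its label, and checks both the eigenbra property (1) and the reality condition. The names `basic_kets`, `lam` and `bra_times_operator`, and the numerical values, are assumptions of the sketch.

```python
import math

s = 1 / math.sqrt(2)
# Orthonormal kets; their conjugate imaginaries are the basic bras.
basic_kets = [[s, s * 1j], [s, -s * 1j]]
lam = [1.0, 3.0]        # real values of the label attached to each basic bra
n = 2

# Matrix of L = sum over k of lam_k |k><k|.
L = [[sum(l * u[i] * u[j].conjugate() for l, u in zip(lam, basic_kets))
      for j in range(n)] for i in range(n)]

def bra_times_operator(u, M):
    """Components of <u|M, where <u| is the conjugate imaginary of the ket u."""
    return [sum(u[i].conjugate() * M[i][j] for i in range(n)) for j in range(n)]

# Equation (1): <k|L = lam_k <k| for each basic bra.
eigenbra_ok = all(
    abs(bra_times_operator(u, L)[j] - l * u[j].conjugate()) < 1e-12
    for l, u in zip(lam, basic_kets) for j in range(n))

# Reality: the matrix of L equals its own conjugate transpose.
is_real = all(abs(L[i][j] - L[j][i].conjugate()) < 1e-12
              for i in range(n) for j in range(n))
```

Both checks succeed here because the labels are real and the basic vectors orthogonal, which are exactly the two conditions used in the proof.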

We can similarly introduce linear operators $L_2, L_3, \ldots, L_u$ by
multiplying $\langle\lambda_1\lambda_2\ldots\lambda_u|a\rangle$ by the
factors $\lambda_2, \lambda_3, \ldots, \lambda_u$ in turn and considering
the resulting sets of numbers as representatives of kets. Each of these
L’s can be shown in the same way to have the basic bras as eigenbras 
and to be real and an observable. The basic bras are simultaneous 
eigenbras of all the L’s. Since these simultaneous eigenbras form a 
complete set, it follows from a theorem of § 13 that any two of the 
L’s commute. 

It will now be shown that, if $\xi_1, \xi_2, \ldots, \xi_u$ are any set of
commuting observables, we can set up an orthogonal representation in
which the basic bras are simultaneous eigenbras of
$\xi_1, \xi_2, \ldots, \xi_u$. Let us suppose first that there is only one
independent simultaneous eigenbra of $\xi_1, \xi_2, \ldots, \xi_u$
belonging to any set of eigenvalues $\xi'_1, \xi'_2, \ldots, \xi'_u$. Then
we may take these simultaneous eigenbras, with arbitrary numerical
coefficients, as our basic bras. They are all orthogonal on account of
the orthogonality theorem (any two of them will have at least one
eigenvalue different, which is sufficient to make them orthogonal) and
there are sufficient of them to form a complete set, from a result of
§ 13. They may conveniently be labelled by the eigenvalues
$\xi'_1, \xi'_2, \ldots, \xi'_u$ to which they belong, so that one of them
is written $\langle\xi'_1\xi'_2\ldots\xi'_u|$.

Passing now to the general case when there are several independent
simultaneous eigenbras of $\xi_1, \xi_2, \ldots, \xi_u$ belonging to some
sets of eigenvalues, we must pick out from all the simultaneous
eigenbras belonging to a set of eigenvalues
$\xi'_1, \xi'_2, \ldots, \xi'_u$ a complete subset, the members of which
are all orthogonal to one another. (The condition of completeness here
means that any simultaneous eigenbra belonging to the eigenvalues
$\xi'_1, \xi'_2, \ldots, \xi'_u$ can be expressed linearly in terms of the
members of the subset.) We must do this for each set of eigenvalues
$\xi'_1, \xi'_2, \ldots, \xi'_u$ and then put all the members of all the
subsets together and take them as the basic bras of the representation.
These bras are all orthogonal, two of them being orthogonal from the
orthogonality theorem if they belong to different sets of eigenvalues
and from the special way in which they were chosen if they belong to
the same set of eigenvalues, and they form altogether a complete set of
bras, as any bra can be expressed linearly in terms of simultaneous
eigenbras and each simultaneous eigenbra can then be expressed linearly
in terms of the members of a subset. There are infinitely many ways
of choosing the subsets, and each way provides one orthogonal
representation.

For labelling the basic bras in this general case, we may use the
eigenvalues $\xi'_1, \xi'_2, \ldots, \xi'_u$ to which they belong, together
with certain additional real variables
$\lambda_1, \lambda_2, \ldots, \lambda_v$ say, which must be introduced to
distinguish basic vectors belonging to the same set of eigenvalues
from one another. A basic bra is then written
$\langle\xi'_1\xi'_2\ldots\xi'_u\lambda_1\lambda_2\ldots\lambda_v|$.
Corresponding to the variables $\lambda_1, \lambda_2, \ldots, \lambda_v$ we
can define linear operators $L_1, L_2, \ldots, L_v$ by equations like (1)
and can show that these linear operators have the basic bras as
eigenbras, and that they are real and observables, and that they
commute with one another and with the $\xi$'s. The basic bras are now
simultaneous eigenbras of all the commuting observables
$\xi_1, \xi_2, \ldots, \xi_u, L_1, L_2, \ldots, L_v$.

Let us define a complete set of commuting observables to be a set of
observables which all commute with one another and for which there
is only one simultaneous eigenstate belonging to any set of
eigenvalues. Then the observables
$\xi_1, \xi_2, \ldots, \xi_u, L_1, L_2, \ldots, L_v$ form a complete set of
commuting observables, there being only one independent simultaneous
eigenbra belonging to the eigenvalues
$\xi'_1, \xi'_2, \ldots, \xi'_u, \lambda_1, \lambda_2, \ldots, \lambda_v$,
namely the corresponding basic bra. Similarly the observables
$L_1, L_2, \ldots, L_u$ defined by equation (1) and the following work form
a complete set of commuting observables. With the help of this
definition the main results of the present section can be concisely
formulated thus:

(i) The basic bras of an orthogonal representation are simul- 
taneous eigenbras of a complete set of commuting observ- 
ables. 

(ii) Given a complete set of commuting observables, we can set 
up an orthogonal representation in which the basic bras are 
simultaneous eigenbras of this complete set. 

(iii) Any set of commuting observables can be made into a com- 
plete commuting set by adding certain observables to it. 

(iv) A convenient way of labelling the basic bras of an orthogonal 
representation is by means of the eigenvalues of the complete 
set of commuting observables of which the basic bras are 
simultaneous eigenbras. 

The conjugate imaginaries of the basic bras of a representation we 
call the basic kets of the representation. Thus, if the basic bras are 
denoted by $\langle\lambda_1\lambda_2\ldots\lambda_u|$, the basic kets will
be denoted by $|\lambda_1\lambda_2\ldots\lambda_u\rangle$. The
representative of a bra $\langle b|$ is given by its scalar product with
each of the basic kets, i.e. by
$\langle b|\lambda_1\lambda_2\ldots\lambda_u\rangle$. It may, like the
representative of a ket, be looked upon either as a set of numbers or
as a function of the variables
$\lambda_1, \lambda_2, \ldots, \lambda_u$. We have

$$\langle b|\lambda_1\lambda_2\ldots\lambda_u\rangle = \overline{\langle\lambda_1\lambda_2\ldots\lambda_u|b\rangle},$$

showing that the representative of a bra is the conjugate complex of the
representative of the conjugate imaginary ket. In an orthogonal
representation, where the basic bras are simultaneous eigenbras of a
complete set of commuting observables, $\xi_1, \xi_2, \ldots, \xi_u$ say,
the basic kets will be simultaneous eigenkets of
$\xi_1, \xi_2, \ldots, \xi_u$.

We have not yet considered the lengths of the basic vectors. With 
an orthogonal representation, the natural thing to do is to normalize
the basic vectors, rather than leave their lengths arbitrary, and so
introduce a further stage of simplification into the representation. 
However, it is possible to normalize them only if the parameters 
which label them all take on discrete values. If any of these para- 
meters are continuous variables that can take on all values in a range, 
the basic vectors are eigenvectors of some observable belonging to 
eigenvalues in a range and are of infinite length, from the discussion 
in § 10 (see p. 39 and top of p. 40). Some other procedure is then 
needed to fix the numerical factors by which the basic vectors may 
be multiplied. To get a convenient method of handling this question 
a new mathematical notation is required, which will be given in the
next section. 


15. The $\delta$ function

Our work in § 10 led us to consider quantities involving a certain
kind of infinity. To get a precise notation for dealing with these
infinities, we introduce a quantity $\delta(x)$ depending on a parameter
$x$ satisfying the conditions

$$\begin{aligned}
\int_{-\infty}^{\infty} \delta(x)\,dx &= 1, \\
\delta(x) &= 0 \quad \text{for } x \neq 0.
\end{aligned} \tag{2}$$

To get a picture of $\delta(x)$, take a function of the real variable $x$
which vanishes everywhere except inside a small domain, of length
$\epsilon$ say, surrounding the origin $x = 0$, and which is so large
inside this domain that its integral over this domain is unity. The
exact shape of the function inside this domain does not matter,
provided there are no unnecessarily wild variations (for example
provided the function is always of order $\epsilon^{-1}$). Then in the
limit $\epsilon \to 0$ this function will go over into $\delta(x)$.

$\delta(x)$ is not a function of $x$ according to the usual mathematical
definition of a function, which requires a function to have a definite
value for each point in its domain, but is something more general,
which we may call an 'improper function' to show up its difference
from a function defined by the usual definition. Thus $\delta(x)$ is not a
quantity which can be generally used in mathematical analysis like
an ordinary function, but its use must be confined to certain simple
types of expression for which it is obvious that no inconsistency
can arise.

The most important property of $\delta(x)$ is exemplified by the
following equation,

$$\int_{-\infty}^{\infty} f(x)\,\delta(x)\,dx = f(0), \tag{3}$$

where $f(x)$ is any continuous function of $x$. We can easily see the
validity of this equation from the above picture of $\delta(x)$. The
left-hand side of (3) can depend only on the values of $f(x)$ very close
to the origin, so that we may replace $f(x)$ by its value at the origin,
$f(0)$, without essential error. Equation (3) then follows from the
first of equations (2). By making a change of origin in (3), we can
deduce the formula

$$\int_{-\infty}^{\infty} f(x)\,\delta(x - a)\,dx = f(a), \tag{4}$$

where $a$ is any real number. Thus the process of multiplying a function
of $x$ by $\delta(x - a)$ and integrating over all $x$ is equivalent to the
process of substituting $a$ for $x$. This general result holds also if the
function of $x$ is not a numerical one, but is a vector or linear
operator depending on $x$.

The range of integration in (3) and (4) need not be from $-\infty$ to
$\infty$, but may be over any domain surrounding the critical point at
which the $\delta$ function does not vanish. In future the limits of
integration will usually be omitted in such equations, it being
understood that the domain of integration is a suitable one.
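
The picture of $\delta(x)$ as the limit of a tall narrow function can be tried numerically. The sketch below is only an illustration under stated assumptions: the box shape, the test function $\cos x$, the point $a = 0.7$ and the helper names `delta_box` and `integrate` are choices made here, not part of the text. It evaluates the left-hand side of (4) for boxes of decreasing width.

```python
import math

def delta_box(x, eps):
    """Box of width eps and height 1/eps centred on 0: one picture of delta(x)."""
    return 1.0 / eps if abs(x) < eps / 2 else 0.0

def integrate(g, lo, hi, n=100_000):
    """Midpoint rule for the integral of g from lo to hi."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

a = 0.7
results = {}
for eps in (0.5, 0.1, 0.02):
    # Left-hand side of (4) with f(x) = cos x and the box in place of delta.
    results[eps] = integrate(lambda x: math.cos(x) * delta_box(x - a, eps),
                             -2.0, 2.0)
    print(eps, results[eps], math.cos(a))
```

As the width is reduced the integral approaches $\cos a$; the range $-2$ to $2$ is an arbitrary domain surrounding the point $x = a$, which is all that is required.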

Equations (3) and (4) show that, although an improper function 
does not itself have a well-defined value, when it occurs as a factor 
in an integrand the integral has a well-defined value. In quantum 
theory, whenever an improper function appears, it will be something 
which is to be used ultimately in an integrand. Therefore it should be 
possible to rewrite the theory in a form in which the improper func- 
tions appear all through only in integrands. One could then eliminate 
the improper functions altogether. The use of improper functions 
thus does not involve any lack of rigour in the theory, but is merely 
a convenient notation, enabling us to express in a concise form
certain relations which we could, if necessary, rewrite in a form not 
involving improper functions, but only in a cumbersome way which 
would tend to obscure the argument. 

An alternative way of defining the $\delta$ function is as the
differential coefficient $\epsilon'(x)$ of the function $\epsilon(x)$
given by

$$\begin{aligned}
\epsilon(x) &= 0 \quad (x < 0) \\
&= 1 \quad (x > 0).
\end{aligned} \tag{5}$$

We may verify that this is equivalent to the previous definition by
substituting $\epsilon'(x)$ for $\delta(x)$ in the left-hand side of (3)
and integrating by parts. We find, for $g_1$ and $g_2$ two positive
numbers,

$$\begin{aligned}
\int_{-g_2}^{g_1} f(x)\,\epsilon'(x)\,dx
&= \bigl[f(x)\,\epsilon(x)\bigr]_{-g_2}^{g_1} - \int_{-g_2}^{g_1} f'(x)\,\epsilon(x)\,dx \\
&= f(g_1) - \int_{0}^{g_1} f'(x)\,dx \\
&= f(0),
\end{aligned}$$

in agreement with (3). The $\delta$ function appears whenever one
differentiates a discontinuous function.
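
A numerical sketch of this equivalence may be helpful. It assumes that $\epsilon'(x)$ is approximated by a central difference quotient of the step function with a small increment (here 0.01), and takes $f(x) = e^x$, $g_1 = 1$, $g_2 = 2$; these choices and the helper names are illustrative assumptions only. The difference quotient is a narrow box of unit area, and both sides of the integration by parts come out close to $f(0) = 1$.

```python
import math

def step(x):
    """epsilon(x) of equation (5); its value at x = 0 does not matter here."""
    return 1.0 if x > 0 else 0.0

def step_derivative(x, h):
    """Central difference quotient of the step: a box of width 2h, height 1/(2h)."""
    return (step(x + h) - step(x - h)) / (2 * h)

def integrate(g, lo, hi, n=100_000):
    """Midpoint rule for the integral of g from lo to hi."""
    w = (hi - lo) / n
    return w * sum(g(lo + (k + 0.5) * w) for k in range(n))

f = math.exp          # test function; its derivative is again exp
g1, g2 = 1.0, 2.0

# Left-hand side: integral of f(x) epsilon'(x) from -g2 to g1.
lhs = integrate(lambda x: f(x) * step_derivative(x, 0.01), -g2, g1)

# After integrating by parts: f(g1) minus the integral of f' from 0 to g1.
rhs = f(g1) - integrate(f, 0.0, g1)
```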

There are a number of elementary equations which one can write
down about $\delta$ functions. These equations are essentially rules of
manipulation for algebraic work involving $\delta$ functions. The meaning
of any of these equations is that its two sides give equivalent results
as factors in an integrand.

Examples of such equations are

$$\delta(-x) = \delta(x) \tag{6}$$

$$x\,\delta(x) = 0, \tag{7}$$

$$\delta(ax) = a^{-1}\delta(x) \quad (a > 0), \tag{8}$$

$$\delta(x^2 - a^2) = \tfrac{1}{2}a^{-1}\{\delta(x - a) + \delta(x + a)\} \quad (a > 0), \tag{9}$$

$$\int \delta(a - x)\,dx\,\delta(x - b) = \delta(a - b), \tag{10}$$

$$f(x)\,\delta(x - a) = f(a)\,\delta(x - a). \tag{11}$$


Equation (6), which merely states that $\delta(x)$ is an even function of
its variable $x$, is trivial. To verify (7) take any continuous function
of $x$, $f(x)$. Then

$$\int f(x)\,x\,\delta(x)\,dx = 0,$$

from (3). Thus $x\,\delta(x)$ as a factor in an integrand is equivalent to
zero, which is just the meaning of (7). (8) and (9) may be verified
by similar elementary arguments. To verify (10) take any continuous
function of $a$, $f(a)$. Then

$$\begin{aligned}
\int f(a)\,da \int \delta(a - x)\,dx\,\delta(x - b)
&= \int \delta(x - b)\,dx \int f(a)\,da\,\delta(a - x) \\
&= \int \delta(x - b)\,dx\,f(x) = \int f(a)\,da\,\delta(a - b).
\end{aligned}$$

Thus the two sides of (10) are equivalent as factors in an integrand
with $a$ as variable of integration. It may be shown in the same way
that they are equivalent also as factors in an integrand with $b$ as
variable of integration, so that equation (10) is justified from either
of these points of view. Equation (11) is also easily justified, with
the help of (4), from two points of view.
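
Rules (7), (8) and (9) may also be checked numerically as statements about factors in an integrand. In the sketch below a narrow normalized Gaussian stands in for $\delta(x)$; the width `w`, the test function `f`, the value $a = 1.5$ and the helper names are assumptions made for the illustration, and the agreement is only approximate, improving as the width is reduced.

```python
import math

def delta_gauss(x, w=1e-3):
    """Narrow normalized Gaussian standing in for delta(x); w is an assumed width."""
    return math.exp(-0.5 * (x / w) ** 2) / (w * math.sqrt(2 * math.pi))

def integrate(g, lo, hi, n=200_000):
    """Midpoint rule for the integral of g from lo to hi."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

def f(x):
    """A smooth test function."""
    return 1.0 + x + x * x

a = 1.5

# (7): x delta(x) is equivalent to zero as a factor in an integrand.
eq7 = integrate(lambda x: f(x) * x * delta_gauss(x), -3.0, 3.0)

# (8): delta(a x) = delta(x) / a, so the integral should be close to f(0) / a.
eq8 = integrate(lambda x: f(x) * delta_gauss(a * x), -3.0, 3.0)

# (9): delta(x^2 - a^2) = {delta(x - a) + delta(x + a)} / (2 a).
eq9 = integrate(lambda x: f(x) * delta_gauss(x * x - a * a), -3.0, 3.0)
eq9_expected = (f(a) + f(-a)) / (2 * a)
```

The Gaussian is used here instead of a box merely because it is smooth, which makes the numerical integration more reliable; the exact shape does not matter, as remarked earlier.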

Equation (10) would be given by an application of (4) with
$f(x) = \delta(x - b)$. We have here an illustration of the fact that we
may often use an improper function as though it were an ordinary
continuous function, without getting a wrong result.

Equation (7) shows that, whenever one divides both sides of an
equation by a variable $x$ which can take on the value zero, one
should add on to one side an arbitrary multiple of $\delta(x)$, i.e. from
an equation

$$A = B \tag{12}$$

one cannot infer

$$A/x = B/x,$$

but only

$$A/x = B/x + c\,\delta(x), \tag{13}$$

where $c$ is unknown.
As an illustration of work with the $\delta$ function, we may consider the
differentiation of $\log x$. The usual formula

$$\frac{d}{dx}\log x = \frac{1}{x} \tag{14}$$

requires examination for the neighbourhood of $x = 0$. In order to
make the reciprocal function $1/x$ well defined in the neighbourhood
of $x = 0$ (in the sense of an improper function) we must impose on
it an extra condition, such as that its integral from $-\epsilon$ to
$\epsilon$ vanishes. With this extra condition, the integral of the
right-hand side of (14) from $-\epsilon$ to $\epsilon$ vanishes, while that
of the left-hand side of (14) equals $\log(-1)$, so that (14) is not a
correct equation. To correct it, we must remember that, taking
principal values, $\log x$ has a pure imaginary term $i\pi$ for negative
values of $x$. As $x$ passes through the value zero this pure imaginary
term vanishes discontinuously. The differentiation of this pure
imaginary term gives us the result $-i\pi\,\delta(x)$, so that (14) should
read

$$\frac{d}{dx}\log x = \frac{1}{x} - i\pi\,\delta(x). \tag{15}$$

The particular combination of reciprocal function and $\delta$ function
appearing in (15) plays an important part in the quantum theory of
collision processes (see § 50).


16. Properties of the basic vectors

Using the notation of the $\delta$ function, we can proceed with the
theory of representations. Let us suppose first that we have a single
observable $\xi$ forming by itself a complete commuting set, the
condition for this being that there is only one eigenstate of $\xi$
belonging to any eigenvalue $\xi'$, and let us set up an orthogonal
representation in which the basic vectors are eigenvectors of $\xi$ and
are written $\langle\xi'|$, $|\xi'\rangle$.

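In the case when the eigenvalues of $\xi$ are discrete, we can normalize
the basic vectors, and we then have

$$\langle\xi'|\xi''\rangle = 0 \quad (\xi' \neq \xi''),$$

$$\langle\xi'|\xi'\rangle = 1.$$

These equations can be combined into the single equation

$$\langle\xi'|\xi''\rangle = \delta_{\xi'\xi''}, \tag{16}$$

where the symbol $\delta$ with two suffixes, which we shall often use in
the future, has the meaning

$$\begin{aligned}
\delta_{rs} &= 0 \quad \text{when } r \neq s \\
&= 1 \quad \text{when } r = s.
\end{aligned} \tag{17}$$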

In the case when the eigenvalues of $\xi$ are continuous we cannot
normalize the basic vectors. If we now consider the quantity
$\langle\xi'|\xi''\rangle$ with $\xi'$ fixed and $\xi''$ varying, we see
from the work connected with expression (29) of § 10 that this quantity
vanishes for $\xi'' \neq \xi'$ and that its integral over a range of
$\xi''$ extending through the value $\xi'$ is finite, equal to $c$ say.
Thus

$$\langle\xi'|\xi''\rangle = c\,\delta(\xi' - \xi'').$$

From (30) of § 10, $c$ is a positive number. It may vary with $\xi'$, so
we should write it $c(\xi')$ or $c'$ for brevity, and thus we have

$$\langle\xi'|\xi''\rangle = c'\,\delta(\xi' - \xi''). \tag{18}$$

Alternatively, we have

$$\langle\xi'|\xi''\rangle = c''\,\delta(\xi' - \xi''), \tag{19}$$

where $c''$ is short for $c(\xi'')$, the right-hand sides of (18) and (19)
being equal on account of (11).

Let us pass to another representation whose basic vectors are
eigenvectors of $\xi$, the new basic vectors being numerical multiples of
the previous ones. Calling the new basic vectors $\langle\xi'^*|$,
$|\xi'^*\rangle$, with the additional label $*$ to distinguish them from
the previous ones, we have

$$\langle\xi'^*| = \bar{k}'\langle\xi'|, \qquad |\xi'^*\rangle = k'|\xi'\rangle,$$

where $k'$ is short for $k(\xi')$ and is a number depending on $\xi'$. We
get

$$\langle\xi'^*|\xi''^*\rangle = \bar{k}'k''\langle\xi'|\xi''\rangle = \bar{k}'k''c'\,\delta(\xi' - \xi'')$$

with the help of (18). This may be written

$$\langle\xi'^*|\xi''^*\rangle = \bar{k}'k'c'\,\delta(\xi' - \xi'')$$

from (11). By choosing $k'$ so that its modulus is $c'^{-1/2}$, which is
possible since $c'$ is positive, we arrange to have

$$\langle\xi'^*|\xi''^*\rangle = \delta(\xi' - \xi''). \tag{20}$$

The lengths of the new basic vectors are now fixed so as to make the
representation as simple as possible. The way these lengths were
fixed is in some respects analogous to the normalizing of the basic
vectors in the case of discrete $\xi'$, equation (20) being of the form of
(16) with the $\delta$ function $\delta(\xi' - \xi'')$ replacing the
$\delta$ symbol $\delta_{\xi'\xi''}$ of equation (16). We shall continue to
work with the new representation and shall drop the $*$ labels in it to
save writing. Thus (20) will now be written

$$\langle\xi'|\xi''\rangle = \delta(\xi' - \xi''). \tag{21}$$


We can develop the theory on closely parallel lines for the discrete
and continuous cases. For the discrete case we have, using (16),

$$\sum_{\xi'} |\xi'\rangle\langle\xi'|\xi''\rangle = \sum_{\xi'} |\xi'\rangle\,\delta_{\xi'\xi''} = |\xi''\rangle,$$

the sum being taken over all eigenvalues. This equation holds for
any basic ket $|\xi''\rangle$ and hence, since the basic kets form a
complete set,

$$\sum_{\xi'} |\xi'\rangle\langle\xi'| = 1. \tag{22}$$

This is a useful equation expressing an important property of the
basic vectors, namely, if $|\xi'\rangle$ is multiplied on the right by
$\langle\xi'|$ the resulting linear operator, summed for all $\xi'$,
equals the unit operator. Equations (16) and (22) give the fundamental
properties of the basic vectors for the discrete case.

Similarly, for the continuous case we have, using (21),

$$\int |\xi'\rangle\,d\xi'\,\langle\xi'|\xi''\rangle = \int |\xi'\rangle\,d\xi'\,\delta(\xi' - \xi'') = |\xi''\rangle \tag{23}$$

from (4) applied with a ket vector for $f(x)$, the range of integration
being the range of eigenvalues. This holds for any basic ket
$|\xi''\rangle$ and hence

$$\int |\xi'\rangle\,d\xi'\,\langle\xi'| = 1. \tag{24}$$

This is of the same form as (22) with an integral replacing the sum.
Equations (21) and (24) give the fundamental properties of the basic
vectors for the continuous case.

Equations (22) and (24) enable one to expand any bra or ket in
terms of the basic vectors. For example, we get for the ket $|P\rangle$
in the discrete case, by multiplying (22) on the right by $|P\rangle$,

$$|P\rangle = \sum_{\xi'} |\xi'\rangle\langle\xi'|P\rangle, \tag{25}$$

which gives $|P\rangle$ expanded in terms of the $|\xi'\rangle$'s and shows
that the coefficients in the expansion are $\langle\xi'|P\rangle$, which
are just the numbers forming the representative of $|P\rangle$.
Similarly, in the continuous case,

$$|P\rangle = \int |\xi'\rangle\,d\xi'\,\langle\xi'|P\rangle, \tag{26}$$

giving $|P\rangle$ as an integral over the $|\xi'\rangle$'s, with the
coefficient in the integrand again just the representative
$\langle\xi'|P\rangle$ of $|P\rangle$. The conjugate imaginary equations to
(25) and (26) would give the bra vector $\langle P|$ expanded in terms of
the basic bras.
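
The discrete relations (16), (22) and (25) can be verified directly in a small hypothetical example. The sketch below assumes a two-dimensional space with two orthonormal basic kets written out in components; the particular vectors, the ket `P` and the helper name `bracket` are assumptions of the illustration.

```python
import math

s = 1 / math.sqrt(2)
basis = [[s, s * 1j], [s, -s * 1j]]   # orthonormal basic kets (assumed example)
n = len(basis)

def bracket(u, v):
    """Scalar product <u|v> of the bra conjugate to the ket u with the ket v."""
    return sum(ui.conjugate() * vi for ui, vi in zip(u, v))

# (16): scalar products of the basic vectors give the two-suffix delta symbol.
gram = [[bracket(u, v) for v in basis] for u in basis]

# (22): the sum of |xi'><xi'| over the basis is the unit operator.
unit = [[sum(u[i] * u[j].conjugate() for u in basis) for j in range(n)]
        for i in range(n)]

# (25): expand |P>; the coefficients are the representative <xi'|P>.
P = [0.3 - 1j, 2.0 + 0.5j]
coeffs = [bracket(u, P) for u in basis]
expanded = [sum(c * u[i] for c, u in zip(coeffs, basis)) for i in range(n)]
```

In this example `gram` and `unit` both come out as the unit matrix and `expanded` reproduces `P`.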

Our present mathematical methods enable us in the continuous
case to expand any ket as an integral of eigenkets of $\xi$. If we do not
use the $\delta$ function notation, the expansion of a general ket will
consist of an integral plus a sum, as in equation (25) of § 10, but the
$\delta$ function enables us to replace the sum by an integral in which
the integrand consists of terms each containing a $\delta$ function as a
factor. For example, the eigenket $|\xi''\rangle$ may be replaced by an
integral of eigenkets, as is shown by the second of equations (23).

If $\langle Q|$ is any bra and $|P\rangle$ any ket we get, by further
applications of (22) and (24),

$$\langle Q|P\rangle = \sum_{\xi'} \langle Q|\xi'\rangle\langle\xi'|P\rangle \tag{27}$$

for discrete $\xi'$ and

$$\langle Q|P\rangle = \int \langle Q|\xi'\rangle\,d\xi'\,\langle\xi'|P\rangle \tag{28}$$

for continuous $\xi'$. These equations express the scalar product of
$\langle Q|$ and $|P\rangle$ in terms of their representatives
$\langle Q|\xi'\rangle$ and $\langle\xi'|P\rangle$. Equation (27) is just
the usual formula for the scalar product of two vectors in terms of the
coordinates of the vectors, and (28) is the natural modification of
this formula for the case of continuous $\xi'$, with an integral instead
of a sum.
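
Formula (28) may be illustrated with a hypothetical pair of kets whose representatives are Gaussian functions of a continuous eigenvalue $\xi'$. The choice of Gaussians, the separation 1.2 between their centres and the helper names `gaussian_rep` and `integrate` are assumptions of this sketch; for two such functions of unit width the integral has the known closed form $e^{-d^2/4}$, $d$ being the separation, which provides a check.

```python
import math

def integrate(g, lo, hi, n=20_000):
    """Midpoint rule for the integral of g from lo to hi."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

def gaussian_rep(centre):
    """Representative <xi'|P> of a hypothetical ket: a normalized Gaussian in xi'."""
    return lambda xi: math.pi ** -0.25 * math.exp(-0.5 * (xi - centre) ** 2)

P = gaussian_rep(0.0)
Q = gaussian_rep(1.2)

# (28): <Q|P> is the integral of <Q|xi'> <xi'|P>, where <Q|xi'> is the
# conjugate complex of <xi'|Q>.
overlap = integrate(lambda xi: Q(xi).conjugate() * P(xi), -12.0, 12.0)

# The same formula with Q = P gives the squared length of |P>.
norm_P = integrate(lambda xi: abs(P(xi)) ** 2, -12.0, 12.0)

expected = math.exp(-1.2 ** 2 / 4)   # closed form for two unit-width Gaussians
```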

The generalization of the foregoing work to the case when $\xi$ has
both discrete and continuous eigenvalues is quite straightforward.
Using $\xi^r$ and $\xi^s$ to denote discrete eigenvalues and $\xi'$ and
$\xi''$ to denote continuous eigenvalues, we have the set of equations

$$\langle\xi^r|\xi^s\rangle = \delta_{\xi^r\xi^s}, \qquad \langle\xi^r|\xi'\rangle = 0, \qquad \langle\xi'|\xi''\rangle = \delta(\xi' - \xi'') \tag{29}$$

as the generalization of (16) or (21). These equations express that
the basic vectors are all orthogonal, that those belonging to discrete
eigenvalues are normalized and those belonging to continuous
eigenvalues have their lengths fixed by the same rule as led to (20).
From (29) we can derive, as the generalization of (22) or (24),

$$\sum_{\xi^r} |\xi^r\rangle\langle\xi^r| + \int |\xi'\rangle\,d\xi'\,\langle\xi'| = 1, \tag{30}$$

the range of integration being the range of continuous eigenvalues.
With the help of (30), we get immediately

$$|P\rangle = \sum_{\xi^r} |\xi^r\rangle\langle\xi^r|P\rangle + \int |\xi'\rangle\,d\xi'\,\langle\xi'|P\rangle \tag{31}$$

as the generalization of (25) or (26), and

$$\langle Q|P\rangle = \sum_{\xi^r} \langle Q|\xi^r\rangle\langle\xi^r|P\rangle + \int \langle Q|\xi'\rangle\,d\xi'\,\langle\xi'|P\rangle \tag{32}$$

as the generalization of (27) or (28).

Let us now pass to the general case when we have several commuting
observables $\xi_1, \xi_2, \ldots, \xi_u$ forming a complete commuting set
and set up an orthogonal representation in which the basic vectors are
simultaneous eigenvectors of all of them, and are written
$\langle\xi'_1\ldots\xi'_u|$, $|\xi'_1\ldots\xi'_u\rangle$. Let us suppose
$\xi_1, \xi_2, \ldots, \xi_v$ $(v \leq u)$ have discrete eigenvalues and
$\xi_{v+1}, \ldots, \xi_u$ have continuous eigenvalues.

Consider the quantity
$\langle\xi'_1\ldots\xi'_v\xi'_{v+1}\ldots\xi'_u|\xi'_1\ldots\xi'_v\xi''_{v+1}\ldots\xi''_u\rangle$.
From the orthogonality theorem, it must vanish unless each
$\xi''_s = \xi'_s$ for $s = v+1, \ldots, u$. By extending the work
connected with expression (29) of § 10 to simultaneous eigenvectors of
several commuting observables and extending also the axiom (30), we
find that the $(u-v)$-fold integral of this quantity with respect to
each $\xi''_s$ over a range extending through the value $\xi'_s$ is a
finite positive number. Calling this number $c'$, the $'$ denoting that
it is a function of $\xi'_1, \ldots, \xi'_v, \xi'_{v+1}, \ldots, \xi'_u$, we
can express our results by the equation

$$\langle\xi'_1\ldots\xi'_v\xi'_{v+1}\ldots\xi'_u|\xi'_1\ldots\xi'_v\xi''_{v+1}\ldots\xi''_u\rangle = c'\,\delta(\xi'_{v+1} - \xi''_{v+1})\ldots\delta(\xi'_u - \xi''_u), \tag{33}$$

with one $\delta$ factor on the right-hand side for each value of $s$ from
$v+1$ to $u$. We now change the lengths of our basic vectors so as to
make $c'$ unity, by a procedure similar to that which led to (20). By
a further use of the orthogonality theorem, we get finally

$$\langle\xi'_1\ldots\xi'_u|\xi''_1\ldots\xi''_u\rangle = \delta_{\xi'_1\xi''_1}\ldots\delta_{\xi'_v\xi''_v}\,\delta(\xi'_{v+1} - \xi''_{v+1})\ldots\delta(\xi'_u - \xi''_u), \tag{34}$$

with a two-suffix $\delta$ symbol on the right-hand side for each $\xi$
with discrete eigenvalues and a $\delta$ function for each $\xi$ with
continuous eigenvalues. This is the generalization of (16) or (21) to
the case when there are several commuting observables in the complete
set.

From (34) we can derive, as the generalization of (22) or (24),

$$\sum_{\xi'_1\ldots\xi'_v} \int\!\ldots\!\int |\xi'_1\ldots\xi'_u\rangle\,d\xi'_{v+1}\ldots d\xi'_u\,\langle\xi'_1\ldots\xi'_u| = 1, \tag{35}$$

the integral being a $(u-v)$-fold one over all the $\xi'$'s with
continuous eigenvalues and the summation being over all the $\xi'$'s
with discrete eigenvalues. Equations (34) and (35) give the fundamental
properties of the basic vectors in the present case. From (35) we can
immediately write down the generalization of (25) or (26) and of (27)
or (28).

The case we have just considered can be further generalized by
allowing some of the $\xi$'s to have both discrete and continuous
eigenvalues. The modifications required in the equations are quite
straightforward, but will not be given here as they are rather
cumbersome to write down in general form.

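There are some problems in which it is convenient not to make the
$c'$ of equation (33) equal unity, but to make it equal to some definite
function of the $\xi'$'s instead. Calling this function of the $\xi'$'s
$\rho'^{-1}$ we then have, instead of (34),

$$\langle\xi'_1\ldots\xi'_u|\xi''_1\ldots\xi''_u\rangle = \rho'^{-1}\,\delta_{\xi'_1\xi''_1}\ldots\delta_{\xi'_v\xi''_v}\,\delta(\xi'_{v+1} - \xi''_{v+1})\ldots\delta(\xi'_u - \xi''_u), \tag{36}$$

and instead of (35) we get

$$\sum_{\xi'_1\ldots\xi'_v} \int\!\ldots\!\int |\xi'_1\ldots\xi'_u\rangle\,\rho'\,d\xi'_{v+1}\ldots d\xi'_u\,\langle\xi'_1\ldots\xi'_u| = 1. \tag{37}$$

$\rho'$ is called the weight function of the representation,
$\rho'\,d\xi'_{v+1}\ldots d\xi'_u$ being the 'weight' attached to a small
volume element of the space of the variables
$\xi'_{v+1}, \ldots, \xi'_u$.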

“, The representations we considered previously all had the weight 
function unity. The introduction of a weight function not unity is 
entirely a matter of convenience and does not add anything to the 
mathematical power of the representation. The basic bras Geir c co 
of a representation with the weight function p’ are connected with 
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the basic bras <&...¢,| of the corresponding representation with the 
weight function unity by 

Ene Su®] = pK Es. Sul, (38) 
as is easily verified. An example of a useful representation with 
non-unit weight function occurs when one has two é¢’s which are 
the polar and azimuthal angles @ and ¢ giving a direction in three- 
dimensional space and one takes p’ = sin 6’. One then has the element 
of solid angle sin 6’ d@‘d¢’ occurring in (37). 
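
As a numerical illustration of the weight function sin θ′ (a minimal
sketch that is not part of the original text; the function name, the
grid sizes and the constant amplitude are assumptions chosen for the
example), one may check that the weight integrates to the full solid
angle 4π, so that a direction-independent amplitude of modulus
(4π)^{-1/2} is normalized:

```python
import math

def solid_angle_integral(f, n_theta=200, n_phi=200):
    """Midpoint-rule estimate of the integral of
    f(theta, phi) * sin(theta) dtheta dphi over all directions."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        weight = math.sin(theta)  # the weight function rho' = sin(theta')
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            total += f(theta, phi) * weight * d_theta * d_phi
    return total

# Total weight of the space of directions: should be close to 4*pi.
total_weight = solid_angle_integral(lambda theta, phi: 1.0)

# Squared modulus of the constant amplitude 1/sqrt(4*pi), integrated
# with the weight: should be close to 1.
norm = solid_angle_integral(lambda theta, phi: 1.0 / (4.0 * math.pi))
```

The weight appears only in the volume element, in line with the remark
that it is a matter of convenience.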


17. The representation of linear operators 

In § 14 we saw how to represent ket and bra vectors by sets of
numbers. We now have to do the same for linear operators, in order
to have a complete scheme for representing all our abstract quantities
by sets of numbers. The same basic vectors that we had in § 14 can
be used again for this purpose.

Let us suppose the basic vectors are simultaneous eigenvectors of
a complete set of commuting observables ξ_1, ξ_2,.., ξ_u. If α is any
linear operator, we take a general basic bra <ξ′_1..ξ′_u| and a general
basic ket |ξ″_1..ξ″_u> and form the numbers

\[ \langle \xi'_1 \ldots \xi'_u | \alpha | \xi''_1 \ldots \xi''_u \rangle. \tag{39} \]

These numbers are sufficient to determine α completely, since in the
first place they determine the ket α|ξ″_1..ξ″_u> (as they provide the
representative of this ket), and the value of this ket for all the basic
kets |ξ″_1..ξ″_u> determines α. The numbers (39) are called the
representative of the linear operator α or of the dynamical variable α. They
are more complicated than the representative of a ket or bra vector
in that they involve the parameters that label two basic vectors
instead of one.

Let us examine the form of these numbers in simple cases. Take
first the case when there is only one ξ, forming a complete commuting
set by itself, and suppose that it has discrete eigenvalues ξ′. The
representative of α is then the discrete set of numbers <ξ′|α|ξ″>. If
one had to write out these numbers explicitly, the natural way of
arranging them would be as a two-dimensional array, thus:

\[ \begin{matrix}
\langle\xi^1|\alpha|\xi^1\rangle & \langle\xi^1|\alpha|\xi^2\rangle & \langle\xi^1|\alpha|\xi^3\rangle & \cdot & \cdot \\
\langle\xi^2|\alpha|\xi^1\rangle & \langle\xi^2|\alpha|\xi^2\rangle & \langle\xi^2|\alpha|\xi^3\rangle & \cdot & \cdot \\
\langle\xi^3|\alpha|\xi^1\rangle & \langle\xi^3|\alpha|\xi^2\rangle & \langle\xi^3|\alpha|\xi^3\rangle & \cdot & \cdot \\
\cdot & \cdot & \cdot & \cdot & \cdot
\end{matrix} \tag{40} \]

where ξ^1, ξ^2, ξ^3,.. are all the eigenvalues of ξ. Such an array is called
a matrix and the numbers are called the elements of the matrix. We
make the convention that the elements must always be arranged so
that those in the same row refer to the same basic bra vector and
those in the same column refer to the same basic ket vector.

An element <ξ′|α|ξ′> referring to two basic vectors with the same
label is called a diagonal element of the matrix, as all such elements
lie on a diagonal. If we put α equal to unity, we have from (16) all
the diagonal elements equal to unity and all the other elements equal
to zero. The matrix is then called the unit matrix.

If α is real, we have

\[ \langle\xi'|\alpha|\xi''\rangle = \overline{\langle\xi''|\alpha|\xi'\rangle}. \tag{41} \]

The effect of these conditions on the matrix (40) is to make the
diagonal elements all real and each of the other elements equal the
conjugate complex of its mirror reflection in the diagonal. The matrix
is then called a Hermitian matrix.
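
Condition (41) is easy to test on an explicit array. The following is
a minimal sketch, not part of the original text; the 3×3 entries and
the helper name is_hermitian are arbitrary assumptions made for
illustration.

```python
# Representative <xi^r|alpha|xi^s> of a real (Hermitian) operator,
# stored as alpha[row][column], i.e. alpha[bra label][ket label].
# The numerical entries are illustrative only.
alpha = [
    [2.0 + 0j, 1 - 1j, 0 + 3j],
    [1 + 1j, -1.0 + 0j, 4 + 0j],
    [0 - 3j, 4 + 0j, 0.5 + 0j],
]

def is_hermitian(matrix, tol=1e-12):
    """Check condition (41): every element equals the conjugate
    complex of its mirror reflection in the diagonal."""
    n = len(matrix)
    return all(
        abs(matrix[r][c] - matrix[c][r].conjugate()) <= tol
        for r in range(n)
        for c in range(n)
    )

# Condition (41) with xi' = xi'' forces the diagonal elements to be real.
diagonal_is_real = all(alpha[k][k].imag == 0.0 for k in range(len(alpha)))

# A matrix that violates (41) in one off-diagonal pair.
not_real = [[1 + 0j, 2 + 0j], [3 + 0j, 1 + 0j]]
```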

If we put α equal to ξ, we get for a general element of the matrix

\[ \langle\xi'|\xi|\xi''\rangle = \xi'\langle\xi'|\xi''\rangle = \xi'\,\delta_{\xi'\xi''}. \tag{42} \]

Thus all the elements not on the diagonal are zero. The matrix is
then called a diagonal matrix. Its diagonal elements are just equal
to the eigenvalues of ξ. More generally, if we put α equal to f(ξ),
a function of ξ, we get

\[ \langle\xi'|f(\xi)|\xi''\rangle = f(\xi')\,\delta_{\xi'\xi''}, \tag{43} \]

and the matrix is again a diagonal matrix.

Let us determine the representative of a product αβ of two linear
operators α and β in terms of the representatives of the factors.
From equation (22) with ξ‴ substituted for ξ′ we obtain

\[ \langle\xi'|\alpha\beta|\xi''\rangle = \langle\xi'|\alpha \sum_{\xi'''} |\xi'''\rangle\langle\xi'''|\beta|\xi''\rangle = \sum_{\xi'''} \langle\xi'|\alpha|\xi'''\rangle\langle\xi'''|\beta|\xi''\rangle, \tag{44} \]

which gives us the required result. Equation (44) shows that the
matrix formed by the elements <ξ′|αβ|ξ″> equals the product of the
matrices formed by the elements <ξ′|α|ξ″> and <ξ′|β|ξ″> respectively,
according to the usual mathematical rule for multiplying matrices.
This rule gives for the element in the rth row and sth column of the
product matrix the sum of the product of each element in the rth
row of the first factor matrix with the corresponding element in the sth
column of the second factor matrix. The multiplication of matrices
is non-commutative, like the multiplication of linear operators.
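
The multiplication rule (44) can be written out directly as a sum over
the intermediate label. The sketch below is an illustration only and
not part of the original text; the two 2×2 matrices and the name
matmul are assumptions chosen to exhibit non-commutativity.

```python
def matmul(a, b):
    """Equation (44): element (r, s) of the product is the sum over k
    of a[r][k] * b[k][s], k playing the part of the intermediate
    eigenvalue label."""
    n = len(a)
    return [
        [sum(a[r][k] * b[k][s] for k in range(n)) for s in range(n)]
        for r in range(n)
    ]

# Two illustrative representatives in a basis with two eigenvalues.
alpha = [[0, 1], [1, 0]]
beta = [[1, 0], [0, -1]]

alpha_beta = matmul(alpha, beta)   # representative of alpha beta
beta_alpha = matmul(beta, alpha)   # representative of beta alpha
```

For these particular matrices the two products differ in sign, so the
order of the factors matters.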

We can summarize our results for the case when there is only one
ξ and it has discrete eigenvalues as follows:

(i) Any linear operator is represented by a matrix.

(ii) The unit operator is represented by the unit matrix.

(iii) A real linear operator is represented by a Hermitian matrix.

(iv) ξ and functions of ξ are represented by diagonal matrices.

(v) The matrix representing the product of two linear operators is the
product of the matrices representing the two factors.

Let us now consider the case when there is only one ξ and it has
continuous eigenvalues. The representative of α is now <ξ′|α|ξ″>, a
function of two variables ξ′ and ξ″ which can vary continuously. It
is convenient to call such a function a 'matrix', using this word in
a generalized sense, in order that we may be able to use the same
terminology for the discrete and continuous cases. One of these
generalized matrices cannot, of course, be written out as a
two-dimensional array like an ordinary matrix, since the number of its
rows and columns is an infinity equal to the number of points on a
line, and the number of its elements is an infinity equal to the
number of points in an area.

We arrange our definitions concerning these generalized matrices
so that the rules (i)-(v) which we had above for the discrete case
hold also for the continuous case. The unit operator is represented
by δ(ξ′−ξ″) and the generalized matrix formed by these elements
we define to be the unit matrix. We still have equation (41) as the
condition for α to be real and we define the generalized matrix formed
by the elements <ξ′|α|ξ″> to be Hermitian when it satisfies this
condition. ξ is represented by

\[ \langle\xi'|\xi|\xi''\rangle = \xi'\,\delta(\xi'-\xi'') \tag{45} \]

and f(ξ) by

\[ \langle\xi'|f(\xi)|\xi''\rangle = f(\xi')\,\delta(\xi'-\xi''), \tag{46} \]

and the generalized matrices formed by these elements we define to be
diagonal matrices. From (11), we could equally well have ξ″ and f(ξ″)
as the coefficients of δ(ξ′−ξ″) on the right-hand sides of (45) and (46)
respectively. Corresponding to equation (44) we now have, from (24),

\[ \langle\xi'|\alpha\beta|\xi''\rangle = \int \langle\xi'|\alpha|\xi'''\rangle \, d\xi''' \, \langle\xi'''|\beta|\xi''\rangle, \tag{47} \]

with an integral instead of a sum, and we define the generalized
matrix formed by the elements on the right-hand side here to be the
product of the matrices formed by <ξ′|α|ξ″> and <ξ′|β|ξ″>. With
these definitions we secure complete parallelism between the discrete
and continuous cases and we have the rules (i)-(v) holding for both.

The question arises how a general diagonal matrix is to be defined
in the continuous case, as so far we have only defined the right-hand
sides of (45) and (46) to be examples of diagonal matrices. One
might be inclined to define as diagonal any matrix whose (ξ′, ξ″)
elements all vanish except when ξ′ differs infinitely little from ξ″,
but this would not be satisfactory, because an important property
of diagonal matrices in the discrete case is that they always commute
with one another and we want this property to hold also in the
continuous case. In order that the matrix formed by the elements
<ξ′|ω|ξ″> in the continuous case may commute with that formed by
the elements on the right-hand side of (45) we must have, using the
multiplication rule (47),

\[ \int \langle\xi'|\omega|\xi'''\rangle \, d\xi''' \, \xi'''\,\delta(\xi'''-\xi'') = \int \xi'\,\delta(\xi'-\xi''') \, d\xi''' \, \langle\xi'''|\omega|\xi''\rangle. \]

With the help of formula (4), this reduces to

\[ \langle\xi'|\omega|\xi''\rangle\,\xi'' = \xi'\,\langle\xi'|\omega|\xi''\rangle \tag{48} \]

or

\[ (\xi'-\xi'')\,\langle\xi'|\omega|\xi''\rangle = 0. \]

This gives, according to the rule by which (13) follows from (12),

\[ \langle\xi'|\omega|\xi''\rangle = c'\,\delta(\xi'-\xi''), \]

where c′ is a number that may depend on ξ′. Thus <ξ′|ω|ξ″> is of the
form of the right-hand side of (46). For this reason we define only
matrices whose elements are of the form of the right-hand side of (46) to
be diagonal matrices. It is easily verified that these matrices all
commute with one another. One can form other matrices whose
(ξ′, ξ″) elements all vanish when ξ′ differs appreciably from ξ″ and
have a different form of singularity when ξ′ equals ξ″ [we shall later
introduce the derivative δ′(x) of the δ function and δ′(ξ′−ξ″) will
then be an example, see § 22 equation (19)], but these other matrices
are not diagonal according to the definition.
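
The statement that matrices of the form (46) all commute can be
checked in one line. The following worked step is an added
illustration, not part of the original text; f and g stand for any two
functions (the symbol g is an assumption introduced here), and the
only tools used are the multiplication rule (47) and the defining
property of the δ function under an integral.

```latex
\int f(\xi')\,\delta(\xi'-\xi''')\; d\xi'''\; g(\xi''')\,\delta(\xi'''-\xi'')
  = f(\xi')\,g(\xi')\,\delta(\xi'-\xi'')
  = \int g(\xi')\,\delta(\xi'-\xi''')\; d\xi'''\; f(\xi''')\,\delta(\xi'''-\xi'')
```

The middle expression is symmetrical between f and g, so the product
taken in either order is the same diagonal matrix.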

Let us now pass on to the case when there is only one ξ and it has
both discrete and continuous eigenvalues. Using ξ^r, ξ^s to denote
discrete eigenvalues and ξ′, ξ″ to denote continuous eigenvalues, we
now have the representative of α consisting of four kinds of quantities,
<ξ^r|α|ξ^s>, <ξ^r|α|ξ′>, <ξ′|α|ξ^r>, <ξ′|α|ξ″>. These quantities can all
be put together and considered to form a more general kind of matrix
having some discrete rows and columns and also a continuous range
of rows and columns. We define unit matrix, Hermitian matrix,
diagonal matrix, and the product of two matrices also for this more
general kind of matrix so as to make the rules (i)-(v) still hold. The
details are a straightforward generalization of what has gone before
and need not be given explicitly.

Let us now go back to the general case of several ξ's, ξ_1, ξ_2,.., ξ_u.
The representative of α, expression (39), may still be looked upon as
forming a matrix, with rows corresponding to different values of
ξ′_1,.., ξ′_u and columns corresponding to different values of ξ″_1,.., ξ″_u.
Unless all the ξ's have discrete eigenvalues, this matrix will be of the
generalized kind with continuous ranges of rows and columns. We
again arrange our definitions so that the rules (i)-(v) hold, with rule
(iv) generalized to:

(iv′) Each ξ_m (m = 1, 2,.., u) and any function of them is represented
by a diagonal matrix.

A diagonal matrix is now defined as one whose general element
<ξ′_1..ξ′_u|ω|ξ″_1..ξ″_u> is of the form

\[ \langle \xi'_1 \ldots \xi'_u | \omega | \xi''_1 \ldots \xi''_u \rangle = c'\, \delta_{\xi'_1 \xi''_1} \ldots \delta_{\xi'_v \xi''_v}\, \delta(\xi'_{v+1}-\xi''_{v+1}) \ldots \delta(\xi'_u-\xi''_u) \tag{49} \]

in the case when ξ_1,.., ξ_v have discrete eigenvalues and ξ_{v+1},.., ξ_u have
continuous eigenvalues, c′ being any function of the ξ′'s. This definition
is the generalization of what we had with one ξ and makes
diagonal matrices always commute with one another. The other
definitions are straightforward and need not be given explicitly.

We now have a linear operator always represented by a matrix. 
The sum of two linear operators is represented by the sum of the 
matrices representing the operators and this, together with rule (v), 
means that the matrices are subject to the same algebraic relations as
the linear operators. If any algebraic equation holds between certain 
linear operators, the same equation must hold between the matrices 
representing those operators. 

The scheme of matrices can be extended to bring in the repre- 
sentatives of ket and bra vectors. The matrices representing linear 
operators are all square matrices with the same number of rows and 
columns, and with, in fact, a one-one correspondence between their 
rows and columns. We may look upon the representative of a ket
|P> as a matrix with a single column by setting all the numbers
<ξ′_1..ξ′_u|P> which form this representative one below the other. The
number of rows in this matrix will be the same as the number of
rows or columns in the square matrices representing linear operators.
Such a single-column matrix can be multiplied on the left by a square
matrix <ξ′_1..ξ′_u|α|ξ″_1..ξ″_u> representing a linear operator, by a rule
similar to that for the multiplication of two square matrices. The
product is another single-column matrix with elements given by

\[ \sum_{\xi''_1 \ldots \xi''_v} \int \!\ldots\! \int \langle \xi'_1 \ldots \xi'_u | \alpha | \xi''_1 \ldots \xi''_u \rangle \, d\xi''_{v+1} \ldots d\xi''_u \, \langle \xi''_1 \ldots \xi''_u | P \rangle. \]

From (35) this is just equal to <ξ′_1..ξ′_u|α|P>, the representative of
α|P>. Similarly we may look upon the representative of a bra <Q|
as a matrix with a single row by setting all the numbers <Q|ξ′_1..ξ′_u>
side by side. Such a single-row matrix may be multiplied on the
right by a square matrix <ξ′_1..ξ′_u|α|ξ″_1..ξ″_u>, the product being another
single-row matrix, which is just the representative of <Q|α. The
single-row matrix representing <Q| may be multiplied on the right
by the single-column matrix representing |P>, the product being a
matrix with just a single element, which is equal to <Q|P>. Finally,
the single-row matrix representing <Q| may be multiplied on the left
by the single-column matrix representing |P>, the product being a
square matrix, which is just the representative of |P><Q|. In this
way all our abstract symbols, linear operators, bra vectors, and ket
vectors, can be represented by matrices, which are subject to the
same algebraic relations as the abstract symbols themselves.
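
The four kinds of product just described can be illustrated for a
single ξ with two discrete eigenvalues. This is a minimal sketch, not
part of the original text; all numerical entries and helper names are
assumptions chosen for the example.

```python
def mat_vec(matrix, column):
    """Square matrix times single-column matrix: representative of alpha|P>."""
    return [sum(m * c for m, c in zip(row, column)) for row in matrix]

def vec_mat(row, matrix):
    """Single-row matrix times square matrix: representative of <Q|alpha."""
    n = len(matrix[0])
    return [sum(row[k] * matrix[k][s] for k in range(len(row))) for s in range(n)]

def inner(row, column):
    """Single row times single column: the number <Q|P>."""
    return sum(r * c for r, c in zip(row, column))

def outer(column, row):
    """Single column times single row: representative of |P><Q|."""
    return [[c * r for r in row] for c in column]

alpha = [[1, 2j], [-2j, 3]]   # <xi'|alpha|xi''>, illustrative values
ket_p = [1 + 0j, 1j]          # <xi'|P>, written as a column
bra_q = [2 + 0j, -1j]         # <Q|xi'>, written as a row

alpha_p = mat_vec(alpha, ket_p)
q_alpha = vec_mat(bra_q, alpha)
q_p = inner(bra_q, ket_p)
p_q = outer(ket_p, bra_q)

# The number <Q|alpha|P> is the same whichever product is formed first.
q_alpha_p_first = inner(bra_q, alpha_p)
q_alpha_p_second = inner(q_alpha, ket_p)
```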


18. Probability amplitudes 

Representations are of great importance in the physical interpretation
of quantum mechanics as they provide a convenient method for
obtaining the probabilities of observables having given values. In
§ 12 we obtained the probability of an observable having any specified
value for a given state and in § 13 we generalized this result
and obtained the probability of a set of commuting observables
simultaneously having specified values for a given state. Let us now
apply this result to a complete set of commuting observables, say the
set of ξ's which we have been dealing with already. According to
formula (51) of § 13, the probability of each ξ_r having the value ξ′_r
for the state corresponding to the normalized ket vector |x> is

\[ P_{\xi'_1 \ldots \xi'_u} = \langle x | \delta_{\xi_1 \xi'_1}\, \delta_{\xi_2 \xi'_2} \ldots \delta_{\xi_u \xi'_u} | x \rangle. \tag{50} \]

If the ξ's all have discrete eigenvalues, we can use (35) with v = u
and no integrals, and get

\[ \begin{aligned}
P_{\xi'_1 \ldots \xi'_u}
 &= \sum_{\xi''_1 \ldots \xi''_u} \langle x | \delta_{\xi_1 \xi'_1}\, \delta_{\xi_2 \xi'_2} \ldots \delta_{\xi_u \xi'_u} | \xi''_1 \ldots \xi''_u \rangle \langle \xi''_1 \ldots \xi''_u | x \rangle \\
 &= \sum_{\xi''_1 \ldots \xi''_u} \langle x | \delta_{\xi''_1 \xi'_1}\, \delta_{\xi''_2 \xi'_2} \ldots \delta_{\xi''_u \xi'_u} | \xi''_1 \ldots \xi''_u \rangle \langle \xi''_1 \ldots \xi''_u | x \rangle \\
 &= \langle x | \xi'_1 \ldots \xi'_u \rangle \langle \xi'_1 \ldots \xi'_u | x \rangle \\
 &= |\langle \xi'_1 \ldots \xi'_u | x \rangle|^2.
\end{aligned} \tag{51} \]

We thus get the simple result that the probability of the ξ's having the
values ξ′ is just the square of the modulus of the appropriate coordinate
of the normalized ket vector corresponding to the state concerned.

If the ξ's do not all have discrete eigenvalues, but if, say, ξ_1,.., ξ_v
have discrete eigenvalues and ξ_{v+1},.., ξ_u have continuous eigenvalues,
then to get something physically significant we must obtain the
probability of each ξ_r (r = 1,.., v) having a specified value ξ′_r and each
ξ_s (s = v+1,.., u) lying in a specified small range ξ′_s to ξ′_s+dξ′_s. For
this purpose we must replace each factor δ_{ξ_s ξ′_s} in (50) by a factor χ_s,
which is that function of the observable ξ_s which is equal to unity
for ξ_s within the range ξ′_s to ξ′_s+dξ′_s and zero otherwise. Proceeding
as before with the help of (35), we obtain for this probability

\[ P_{\xi'_1 \ldots \xi'_u}\, d\xi'_{v+1} \ldots d\xi'_u = |\langle \xi'_1 \ldots \xi'_u | x \rangle|^2 \, d\xi'_{v+1} \ldots d\xi'_u. \tag{52} \]

Thus in every case the probability distribution of values for the ξ's is
given by the square of the modulus of the representative of the normalized
ket vector corresponding to the state concerned.

The numbers which form the representative of a normalized ket 
(or bra) may for this reason be called probability amplitudes. The 
square of the modulus of a probability amplitude is an ordinary 
probability, or a probability per unit range for those variables that 
have continuous ranges of values. 
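
For a single ξ with three discrete eigenvalues, the passage from
amplitudes to probabilities may be sketched as follows. The example is
an added illustration and not part of the original text; the
particular amplitudes are assumptions chosen so that the arithmetic is
simple.

```python
import math

# An unnormalized representative <xi'|x>, one number per eigenvalue of xi.
amplitudes = [3 + 4j, 0 + 0j, 12j]

# Normalize the ket so that the squared moduli add up to unity.
norm = math.sqrt(sum(abs(a) ** 2 for a in amplitudes))
probability_amplitudes = [a / norm for a in amplitudes]

# Result (51): each probability is the squared modulus of an amplitude.
probabilities = [abs(a) ** 2 for a in probability_amplitudes]
```

Here the squared moduli before normalization are 25, 0 and 144, so the
probabilities come out as 25/169, 0 and 144/169.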

We may be interested in a state whose corresponding ket |x> cannot 
be normalized. This occurs, for example, if the state is an eigenstate 
of some observable belonging to an eigenvalue lying in a range of 
eigenvalues. The formula (51) or (52) can then still be used to give 
the relative probability of the ξ's having specified values or having
values lying in specified small ranges, i.e. it will give correctly the
ratios of the probabilities for different ξ′'s. The numbers <ξ′_1..ξ′_u|x>
may then be called relative probability amplitudes.

The representation for which the above results hold is characterized
by the basic vectors being simultaneous eigenvectors of all the ξ's.
It may also be characterized by the requirement that each of the ξ's
shall be represented by a diagonal matrix, this condition being easily
seen to be equivalent to the previous one. The latter characterization
is usually the more convenient one. For brevity, we shall formulate
it as each of the ξ's 'being diagonal in the representation'.

Provided the ξ's form a complete set of commuting observables,
the representation is completely determined by the characterization,
apart from arbitrary phase factors in the basic vectors. Each basic bra
<ξ′_1..ξ′_u| may be multiplied by e^{iγ′}, where γ′ is any real function of
the variables ξ′_1,.., ξ′_u, without changing any of the conditions which
the representation has to satisfy, i.e. the condition that the ξ's are
diagonal or that the basic vectors are simultaneous eigenvectors of
the ξ's, and the fundamental properties of the basic vectors (34) and
(35). With the basic bras changed in this way, the representative
<ξ′_1..ξ′_u|P> of a ket |P> gets multiplied by e^{iγ′}, the representative
<Q|ξ′_1..ξ′_u> of a bra <Q| gets multiplied by e^{−iγ′} and the representative
<ξ′_1..ξ′_u|α|ξ″_1..ξ″_u> of a linear operator α gets multiplied by e^{i(γ′−γ″)}.
The probabilities or relative probabilities (51), (52) are, of course,
unaltered.
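
The effect of a change of phase factors can be checked numerically.
The sketch below is an added illustration, not part of the original
text; the amplitudes, the phases γ′ and the Hermitian matrix are
arbitrary assumptions, and the quantity <x|α|x> is included only as
one example of a number that does not depend on the phases.

```python
import cmath

amplitudes = [0.6 + 0j, 0.8j]    # normalized representative <xi'|x>
gammas = [0.3, -1.7]             # arbitrary real phases gamma', one per basic bra
alpha = [[1 + 0j, 2 - 1j], [2 + 1j, -1 + 0j]]   # <xi'|alpha|xi''>

def expectation(matrix, amps):
    """The number <x|alpha|x> formed from the representatives."""
    n = len(amps)
    return sum(
        amps[r].conjugate() * matrix[r][c] * amps[c]
        for r in range(n)
        for c in range(n)
    )

# Representatives after each basic bra is multiplied by exp(i gamma').
new_amplitudes = [cmath.exp(1j * g) * a for g, a in zip(gammas, amplitudes)]
new_alpha = [
    [cmath.exp(1j * (gammas[r] - gammas[c])) * alpha[r][c] for c in range(2)]
    for r in range(2)
]

old_probabilities = [abs(a) ** 2 for a in amplitudes]
new_probabilities = [abs(a) ** 2 for a in new_amplitudes]
old_expectation = expectation(alpha, amplitudes)
new_expectation = expectation(new_alpha, new_amplitudes)
```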

The probabilities that one calculates in practical problems in 
quantum mechanics are nearly always obtained from the squares 
of the moduli of probability amplitudes or relative probability ampli- 
tudes. Even when one is interested only in the probability of an 
incomplete set of commuting observables having specified values, it 
is usually necessary first to make the set a complete one by the 
introduction of some extra commuting observables and to obtain 
the probability of the complete set having specified values (as the 
square of the modulus of a probability amplitude), and then to sum
or integrate over all possible values of the extra observables. A 
more direct application of formula (51) of § 13 is usually not 
practicable. 
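
The procedure of completing the set and then summing over the extra
observables may be sketched as follows. The example is an added
illustration, not part of the original text; it assumes one observable
of interest ξ and one extra observable β, each with two discrete
eigenvalues labelled 0 and 1, and the amplitudes are made up.

```python
# Probability amplitudes <xi' beta'|x>, keyed by the pair (xi', beta').
amplitudes = {
    (0, 0): 0.5 + 0j,
    (0, 1): 0.5j,
    (1, 0): -0.5 + 0j,
    (1, 1): 0.5 + 0j,
}

def probability_of_xi(xi_value):
    """Probability of xi having a given value: the squared moduli of the
    amplitudes summed over all values of the extra observable beta."""
    return sum(
        abs(a) ** 2
        for (xi, beta), a in amplitudes.items()
        if xi == xi_value
    )

p_xi_0 = probability_of_xi(0)
p_xi_1 = probability_of_xi(1)
```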

To introduce a representation in practice 

(i) We look for observables which we would like to have diagonal, 
either because we are interested in their probabilities or for 
reasons of mathematical simplicity ; 

(ii) We must see that they all commute, a necessary condition
since diagonal matrices always commute;

(iii) We then see that they form a complete commuting set, and
if not we add some more commuting observables to them to
make them into a complete commuting set;

(iv) We set up an orthogonal representation with this complete 
commuting set diagonal. 

The representation is then completely determined except for the 
arbitrary phase factors. For most purposes the arbitrary phase 
factors are unimportant and trivial, so that we may count the 
representation as being completely determined by the observables 
that are diagonal in it. This fact is already implied in our notation, 
since the only indication in a representative of the representation to 
which it belongs are the letters denoting the observables that are 
diagonal. 

It may be that we are interested in two representations for the
same dynamical system. Suppose that in one of them the complete
set of commuting observables ξ_1,.., ξ_u are diagonal and the basic
bras are <ξ′_1..ξ′_u| and in the other the complete set of commuting
observables η_1,.., η_w are diagonal and the basic bras are <η′_1..η′_w|.
A ket |P> will now have the two representatives <ξ′_1..ξ′_u|P> and
<η′_1..η′_w|P>. If ξ_1,.., ξ_v have discrete eigenvalues and ξ_{v+1},.., ξ_u have
continuous eigenvalues and if η_1,.., η_x have discrete eigenvalues and
η_{x+1},.., η_w have continuous eigenvalues, we get from (35)

\[ \langle \eta'_1 \ldots \eta'_w | P \rangle = \sum_{\xi'_1 \ldots \xi'_v} \int \!\ldots\! \int \langle \eta'_1 \ldots \eta'_w | \xi'_1 \ldots \xi'_u \rangle \, d\xi'_{v+1} \ldots d\xi'_u \, \langle \xi'_1 \ldots \xi'_u | P \rangle, \tag{53} \]

and interchanging ξ's and η's

\[ \langle \xi'_1 \ldots \xi'_u | P \rangle = \sum_{\eta'_1 \ldots \eta'_x} \int \!\ldots\! \int \langle \xi'_1 \ldots \xi'_u | \eta'_1 \ldots \eta'_w \rangle \, d\eta'_{x+1} \ldots d\eta'_w \, \langle \eta'_1 \ldots \eta'_w | P \rangle. \tag{54} \]

These are the transformation equations which give one representative
of |P> in terms of the other. They show that either representative
is expressible linearly in terms of the other, with the quantities

\[ \langle \eta'_1 \ldots \eta'_w | \xi'_1 \ldots \xi'_u \rangle, \qquad \langle \xi'_1 \ldots \xi'_u | \eta'_1 \ldots \eta'_w \rangle \tag{55} \]

as coefficients. These quantities are called the transformation
functions. Similar equations may be written down to connect the two
representatives of a bra vector or of a linear operator. The
transformation functions (55) are in every case the means which enable
one to pass from one representative to the other. Each of the
transformation functions is the conjugate complex of the other, and
they satisfy the conditions

\[ \sum_{\xi'_1 \ldots \xi'_v} \int \!\ldots\! \int \langle \eta'_1 \ldots \eta'_w | \xi'_1 \ldots \xi'_u \rangle \, d\xi'_{v+1} \ldots d\xi'_u \, \langle \xi'_1 \ldots \xi'_u | \eta''_1 \ldots \eta''_w \rangle = \delta_{\eta'_1 \eta''_1} \ldots \delta_{\eta'_x \eta''_x}\, \delta(\eta'_{x+1}-\eta''_{x+1}) \ldots \delta(\eta'_w-\eta''_w) \tag{56} \]

and the corresponding conditions with ξ's and η's interchanged, as
may be verified from (35) and (34) and the corresponding equations
for the η's.
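
In the simplest case, with one ξ and one η each having two discrete
eigenvalues, the transformation equations (53), (54) and the
conditions (56) reduce to finite sums. The following is a minimal
sketch under that assumption; it is not part of the original text, and
the particular transformation function and ket are made up for
illustration.

```python
import math

s = 1.0 / math.sqrt(2.0)

# Transformation function <xi^x|eta^e>, stored as xi_eta[x][e].
xi_eta = [
    [complex(s, 0.0), complex(s, 0.0)],
    [complex(0.0, s), complex(0.0, -s)],
]
# <eta^e|xi^x> is the conjugate complex of <xi^x|eta^e>.
eta_xi = [[xi_eta[x][e].conjugate() for x in range(2)] for e in range(2)]

def to_eta(xi_rep):
    """Equation (53) with sums only: <eta|P> from <xi|P>."""
    return [sum(eta_xi[e][x] * xi_rep[x] for x in range(2)) for e in range(2)]

def to_xi(eta_rep):
    """Equation (54) with sums only: <xi|P> from <eta|P>."""
    return [sum(xi_eta[x][e] * eta_rep[e] for e in range(2)) for x in range(2)]

p_xi = [1 + 0j, 2j]            # illustrative representative <xi'|P>
p_eta = to_eta(p_xi)
round_trip = to_xi(p_eta)      # should reproduce p_xi

# Conditions (56): sum over xi' of <eta'|xi'><xi'|eta''> is a two-suffix delta.
conditions = [
    [sum(eta_xi[e1][x] * xi_eta[x][e2] for x in range(2)) for e2 in range(2)]
    for e1 in range(2)
]
```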

Transformation functions are examples of probability amplitudes
or relative probability amplitudes. Let us take the case when all the
ξ's and all the η's have discrete eigenvalues. Then the basic ket
|η′_1..η′_w> is normalized, so that its representative in the
ξ-representation, <ξ′_1..ξ′_u|η′_1..η′_w>, is a probability amplitude for each
set of values for the ξ′'s. The state to which these probability
amplitudes refer, namely the state corresponding to |η′_1..η′_w>, is
characterized by the condition that a simultaneous measurement of
η_1,.., η_w is certain to lead to the results η′_1,.., η′_w. Thus
|<ξ′_1..ξ′_u|η′_1..η′_w>|² is the probability of the ξ's having the values
ξ′_1..ξ′_u for the state for which the η's certainly have the values
η′_1..η′_w. Since

\[ |\langle \xi'_1 \ldots \xi'_u | \eta'_1 \ldots \eta'_w \rangle|^2 = |\langle \eta'_1 \ldots \eta'_w | \xi'_1 \ldots \xi'_u \rangle|^2, \]

we have the theorem of reciprocity: the probability of the ξ's having
the values ξ′ for the state for which the η's certainly have the values η′
is equal to the probability of the η's having the values η′ for the state for
which the ξ's certainly have the values ξ′.

If all the η's have discrete eigenvalues and some of the ξ's have
continuous eigenvalues, |<ξ′_1..ξ′_u|η′_1..η′_w>|² still gives the probability
distribution of values for the ξ's for the state for which the η's
certainly have the values η′. If some of the η's have continuous
eigenvalues, |η′_1..η′_w> is not normalized and |<ξ′_1..ξ′_u|η′_1..η′_w>|² then gives
only the relative probability distribution of values for the ξ's for the
state for which the η's certainly have the values η′.
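
The reciprocity of the two probabilities may be seen in a small
discrete example. The sketch is an added illustration, not part of the
original text; it assumes one ξ and one η with two eigenvalues each,
the η basis being obtained from the ξ basis by a rotation through an
arbitrarily chosen angle.

```python
import math

t = 0.4  # arbitrary rotation angle, an assumption of this example

# Transformation function <xi^j|eta^k>, stored as xi_eta[j][k].
xi_eta = [
    [complex(math.cos(t)), complex(-math.sin(t))],
    [complex(math.sin(t)), complex(math.cos(t))],
]
# <eta^k|xi^j> is the conjugate complex of <xi^j|eta^k>.
eta_xi = [[xi_eta[j][k].conjugate() for j in range(2)] for k in range(2)]

# Probability of xi = xi^j for the state in which eta certainly equals eta^k.
prob_xi_given_eta = [[abs(xi_eta[j][k]) ** 2 for k in range(2)] for j in range(2)]
# Probability of eta = eta^k for the state in which xi certainly equals xi^j.
prob_eta_given_xi = [[abs(eta_xi[k][j]) ** 2 for j in range(2)] for k in range(2)]
```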


19. Theorems about functions of observables 

We shall illustrate the mathematical value of representations by 
using them to prove some theorems. 

THEOREM 1. A linear operator that commutes with an observable ξ
commutes also with any function of ξ.

The theorem is obviously true when the function is expressible as
a power series. To prove it generally, let ω be the linear operator,
so that we have the equation

\[ \xi\omega - \omega\xi = 0. \tag{57} \]

Let us introduce a representation in which ξ is diagonal. If ξ by
itself does not form a complete commuting set of observables, we must
make it into a complete commuting set by adding certain observables,
β say, to it, and then take the representation in which ξ and the β's
are diagonal. (The case when ξ does form a complete commuting set
by itself can be looked upon as a special case of the preceding one
with the number of β variables zero.) In this representation equation
(57) becomes

\[ \langle \xi'\beta' | \xi\omega - \omega\xi | \xi''\beta'' \rangle = 0, \]

which reduces to

\[ \xi'\, \langle \xi'\beta' | \omega | \xi''\beta'' \rangle - \langle \xi'\beta' | \omega | \xi''\beta'' \rangle\, \xi'' = 0. \]

In the case when the eigenvalues of ξ are discrete, this equation
shows that all the matrix elements <ξ′β′|ω|ξ″β″> of ω vanish except
those for which ξ′ = ξ″. In the case when the eigenvalues of ξ are
continuous it shows, like equation (48), that <ξ′β′|ω|ξ″β″> is of the
form

\[ \langle \xi'\beta' | \omega | \xi''\beta'' \rangle = c\,\delta(\xi'-\xi''), \]

where c is some function of ξ′ and the β′'s and β″'s. In either case
we may say that the matrix representing ω 'is diagonal with respect
to ξ'. If f(ξ) denotes any function of ξ in accordance with the general
theory of § 11, which requires f(ξ″) to be defined for ξ″ any eigenvalue
of ξ, we can deduce in either case

\[ f(\xi')\, \langle \xi'\beta' | \omega | \xi''\beta'' \rangle - \langle \xi'\beta' | \omega | \xi''\beta'' \rangle\, f(\xi'') = 0. \]

This gives

\[ \langle \xi'\beta' | f(\xi)\,\omega - \omega f(\xi) | \xi''\beta'' \rangle = 0, \]

so that

\[ f(\xi)\,\omega - \omega f(\xi) = 0 \]

and the theorem is proved.
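
Theorem 1 can be illustrated with small matrices. The following is a
minimal sketch, not part of the original text; it assumes a ξ with the
discrete eigenvalues 1, 1, 4, the repeated eigenvalue standing for the
case in which extra labels β′ are needed, and it takes the square root
as an example of a function f. The matrix for ω is made up so as to be
diagonal with respect to ξ.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][s] for k in range(n)) for s in range(n)]
            for r in range(n)]

def commutator(a, b):
    """Elements of a b - b a."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[r][s] - ba[r][s] for s in range(n)] for r in range(n)]

def diag(values):
    n = len(values)
    return [[values[r] if r == s else 0.0 for s in range(n)] for r in range(n)]

xi_values = [1.0, 1.0, 4.0]
xi = diag(xi_values)

# omega has non-vanishing elements only between equal eigenvalues of xi.
omega = [
    [2.0, 5.0, 0.0],
    [-1.0, 3.0, 0.0],
    [0.0, 0.0, 7.0],
]

# f(xi) is represented by the diagonal matrix of f(xi'), as in (43).
f_xi = diag([math.sqrt(v) for v in xi_values])

comm_xi = commutator(xi, omega)     # xi omega - omega xi
comm_f = commutator(f_xi, omega)    # f(xi) omega - omega f(xi)
```

Element (r, s) of either commutator is the difference of the two
diagonal entries multiplied by the element of ω, which is why both
vanish together.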

As a special case of the theorem, we have the result that any
observable that commutes with an observable ξ also commutes with
any function of ξ. This result appears as a physical necessity when
we identify, as in § 13, the condition of commutability of two
observables with the condition of compatibility of the corresponding
observations. Any observation that is compatible with the
measurement of an observable ξ must also be compatible with the
measurement of f(ξ), since any measurement of ξ includes in itself
a measurement of f(ξ).

THEOREM 2. A linear operator that commutes with each of a complete
set of commuting observables is a function of those observables.

Let ω be the linear operator and ξ_1, ξ_2,.., ξ_u the complete set of
commuting observables, and set up a representation with these
observables diagonal. Since ω commutes with each of the ξ's, the
matrix representing it is diagonal with respect to each of the ξ's,
by the argument we had above. This matrix is therefore a diagonal
matrix and is of the form (49), involving a number c′ which is a
function of the ξ′'s. It thus represents the function of the ξ's that
c′ is of the ξ′'s, and hence ω equals this function of the ξ's.

THEOREM 3. If an observable ξ and a linear operator g are such that
any linear operator that commutes with ξ also commutes with g, then g
is a function of ξ.

This is the converse of Theorem 1. To prove it, we use the same
representation with ξ diagonal as we had for Theorem 1. In the first
place, we see that g must commute with ξ itself, and hence the
representative of g must be diagonal with respect to ξ, i.e. it must
be of the form

\[ \langle \xi'\beta' | g | \xi''\beta'' \rangle = a(\xi'\beta'\beta'')\,\delta_{\xi'\xi''} \quad\text{or}\quad a(\xi'\beta'\beta'')\,\delta(\xi'-\xi''), \]

according to whether ξ has discrete or continuous eigenvalues. Now
let ω be any linear operator that commutes with ξ, so that its
representative is of the form

\[ \langle \xi'\beta' | \omega | \xi''\beta'' \rangle = b(\xi'\beta'\beta'')\,\delta_{\xi'\xi''} \quad\text{or}\quad b(\xi'\beta'\beta'')\,\delta(\xi'-\xi''). \]

By hypothesis ω must also commute with g, so that

\[ \langle \xi'\beta' | g\omega - \omega g | \xi''\beta'' \rangle = 0. \tag{58} \]

If we suppose for definiteness that the β's have discrete eigenvalues,
(58) leads, with the help of the law of matrix multiplication, to

\[ \sum_{\beta'''} \{ a(\xi'\beta'\beta''')\, b(\xi'\beta'''\beta'') - b(\xi'\beta'\beta''')\, a(\xi'\beta'''\beta'') \} = 0, \tag{59} \]

the left-hand side of (58) being equal to the left-hand side of (59)
multiplied by δ_{ξ′ξ″} or δ(ξ′−ξ″). Equation (59) must hold for all
functions b(ξ′β′β″). We can deduce that

\[ a(\xi'\beta'\beta'') = 0 \quad\text{for}\quad \beta' \neq \beta'', \]

\[ a(\xi'\beta'\beta') = a(\xi'\beta''\beta''). \]

The first of these results shows that the matrix representing g is
diagonal and the second shows that a(ξ′β′β′) is a function of ξ′ only.
We can now infer that g is that function of ξ which a(ξ′β′β′) is of ξ′,
so the theorem is proved. The proof is analogous if some of the β's
have continuous eigenvalues.

Theorems 1 and 3 are still valid if we replace the observable ξ by
any set of commuting observables ξ_1, ξ_2,.., ξ_r, only formal changes
being needed in the proofs.


20. Developments in notation 

The theory of representations that we have developed provides a
general system for labelling kets and bras. In a representation in which
the complete set of commuting observables ξ_1,.., ξ_u are diagonal any
ket |P> will have a representative <ξ′_1..ξ′_u|P>, or <ξ′|P> for brevity.
This representative is a definite function of the variables ξ′, say ψ(ξ′).
The function ψ then determines the ket |P> completely, so it may be
used to label this ket, to replace the arbitrary label P. In symbols,

\[ \left. \begin{aligned} \text{if} \quad & \langle\xi'|P\rangle = \psi(\xi') \\ \text{we put} \quad & |P\rangle = |\psi(\xi)\rangle. \end{aligned} \right\} \tag{60} \]

We must put |P> equal to |ψ(ξ)> and not |ψ(ξ′)>, since it does not
depend on a particular set of eigenvalues for the ξ's, but only on the
form of the function ψ.

With f(ξ) any function of the observables ξ_1,.., ξ_u, f(ξ)|P> will
have as its representative

\[ \langle\xi'|f(\xi)|P\rangle = f(\xi')\,\psi(\xi'). \]

Thus according to (60) we put

\[ f(\xi)|P\rangle = |f(\xi)\psi(\xi)\rangle. \]

With the help of the second of equations (60) we now get

\[ f(\xi)|\psi(\xi)\rangle = |f(\xi)\psi(\xi)\rangle. \tag{61} \]

This is a general result holding for any functions f and ψ of the ξ's,
and it shows that the vertical line | is not necessary with the new
notation for a ket; either side of (61) may be written simply as
f(ξ)ψ(ξ)>. Thus the rule for the new notation becomes:

\[ \left. \begin{aligned} \text{if} \quad & \langle\xi'|P\rangle = \psi(\xi') \\ \text{we put} \quad & |P\rangle = \psi(\xi)\rangle. \end{aligned} \right\} \tag{62} \]

We may further shorten ψ(ξ)> to ψ>, leaving the variables ξ understood,
if no ambiguity arises thereby.

The ket ψ(ξ)> may be considered as the product of the linear
operator ψ(ξ) with a ket which is denoted simply by > without a
label. We call the ket > the standard ket. Any ket whatever can be
expressed as a function of the ξ's multiplied into the standard ket.
For example, taking |P> in (62) to be the basic ket |ξ″>, we find

\[ |\xi''\rangle = \delta_{\xi_1 \xi''_1} \ldots \delta_{\xi_v \xi''_v}\, \delta(\xi_{v+1}-\xi''_{v+1}) \ldots \delta(\xi_u-\xi''_u)\rangle \tag{63} \]

in the case when ξ_1,.., ξ_v have discrete eigenvalues and ξ_{v+1},.., ξ_u have
continuous eigenvalues. The standard ket is characterized by the
condition that its representative <ξ′|> is unity over the whole domain
of the variable ξ′, as may be seen by putting ψ = 1 in (62).

A further contraction may be made in the notation, namely to
leave the symbol > for the standard ket understood. A ket is then
written simply as ψ(ξ), a function of the observables ξ. A function
of the ξ's used in this way to denote a ket is called a wave function.†
The system of notation provided by wave functions is the one usually
used by most authors for calculations in quantum mechanics. In
using it one should remember that each wave function is understood
to have the standard ket multiplied into it on the right, which
prevents one from multiplying the wave function by any operator
on the right. Wave functions can be multiplied by operators only on
the left. This distinguishes them from ordinary functions of the ξ's,
which are operators and can be multiplied by operators on either the
left or the right. A wave function is just the representative of a ket
expressed as a function of the observables ξ, instead of eigenvalues ξ′
for those observables. The square of its modulus gives the probability
(or the relative probability, if it is not normalized) of the ξ's
having specified values, or lying in specified small ranges, for the
corresponding state.

† The reason for this name is that in the early days of quantum mechanics all the
examples of these functions were of the form of waves. The name is not a descriptive
one from the point of view of the modern general theory.

The new notation for bras may be developed in the same way as
for kets. A bra <Q| whose representative <Q|ξ′> is φ(ξ′) we write
<φ(ξ)|. With this notation the conjugate imaginary to |ψ(ξ)> is
<ψ̄(ξ)|. Thus the rule that we have used hitherto, that a ket and
its conjugate imaginary bra are both specified by the same label,
must be extended to read: if the labels of a ket involve complex
numbers or complex functions, the labels of the conjugate imaginary
bra involve the conjugate complex numbers or functions. As in the
case of kets we can show that <φ(ξ)|f(ξ) and <φ(ξ)f(ξ)| are the same,
so that the vertical line can be omitted. We can consider <φ(ξ) as
the product of the linear operator φ(ξ) into the standard bra <, which
is the conjugate imaginary of the standard ket >. We may leave
the standard bra understood, so that a general bra is written as φ(ξ),
the conjugate complex of a wave function. The conjugate complex
of a wave function can be multiplied by any linear operator on the
right, but cannot be multiplied by a linear operator on the left. We
can construct triple products of the form <f(ξ)>. Such a triple product
is a number, equal to f(ξ) summed or integrated over the whole
domain of eigenvalues for the ξ's,

\[ \langle f(\xi) \rangle = \sum_{\xi'_1 \ldots \xi'_v} \int \!\ldots\! \int f(\xi') \, d\xi'_{v+1} \ldots d\xi'_u \tag{64} \]

in the case when ξ_1,.., ξ_v have discrete eigenvalues and ξ_{v+1},.., ξ_u have
continuous eigenvalues.

The standard ket and bra are defined with respect to a representation. 
If we carried through the above work with a different representation 
in which the complete set of commuting observables η are 
diagonal, or if we merely changed the phase factors in the representation 
with the ξ’s diagonal, we should get a different standard ket and 
bra. In a piece of work in which more than one standard ket or bra 
appears one must, of course, distinguish them by giving them labels. 

A further development of the notation which is of great importance 
for dealing with complicated dynamical systems will now be discussed. 
Suppose we have a dynamical system describable in terms of dynamical 
variables which can all be divided into two sets, set A and set B, 
say, such that any member of set A commutes with any member of 
set B. A general dynamical variable must be expressible as a function 
of the A-variables and B-variables together. We may consider 
another dynamical system in which the dynamical variables are the 
A-variables only—let us call it the A-system. Similarly we may 
consider a third dynamical system in which the dynamical variables 
are the B-variables only—the B-system. The original system can 
then be looked upon as a combination of the A-system and the 
B-system in accordance with the mathematical scheme given below. 

Let us take any ket |a> for the A-system and any ket |b> for the 
B-system. We assume that they have a product |a>|b> for which 
the commutative and distributive axioms of multiplication hold, i.e. 

$$|a\rangle|b\rangle = |b\rangle|a\rangle,$$
$$\{c_1|a_1\rangle + c_2|a_2\rangle\}|b\rangle = c_1|a_1\rangle|b\rangle + c_2|a_2\rangle|b\rangle,$$
$$|a\rangle\{c_1|b_1\rangle + c_2|b_2\rangle\} = c_1|a\rangle|b_1\rangle + c_2|a\rangle|b_2\rangle,$$

the c’s being numbers. We can give a meaning to any A-variable 
operating on the product |a>|b> by assuming that it operates only 
on the |a> factor and commutes with the |b> factor, and similarly 
we can give a meaning to any B-variable operating on this product 
by assuming that it operates only on the |b> factor and commutes 
with the |a> factor. (This makes every A-variable commute with 
every B-variable.) Thus any dynamical variable of the original 
system can operate on the product |a>|b>, so this product can be 
looked upon as a ket for the original system, and may then be 
written |ab>, the two labels a and b being sufficient to specify it. 
In this way we get the fundamental equations 

$$|a\rangle|b\rangle = |b\rangle|a\rangle = |ab\rangle. \tag{65}$$

The multiplication here is of quite a different kind from any that 
occurs earlier in the theory. The ket vectors |a> and |b> are in two 
different vector spaces and their product is in a third vector space, 
which may be called the product of the two previous vector spaces. 
The number of dimensions of the product space is equal to the 
product of the number of dimensions of each of the factor spaces. 
A general ket vector of the product space is not of the form (65), but 
is a sum or integral of kets of this form. 

Let us take a representation for the A-system in which a complete 
set of commuting observables ξ_A of the A-system are diagonal. We 
shall then have the basic bras <ξ'_A| for the A-system. Similarly, taking 
a representation for the B-system with the observables ξ_B diagonal, 
we shall have the basic bras <ξ'_B| for the B-system. The products 

$$\langle\xi'_A|\langle\xi'_B| = \langle\xi'_A \xi'_B| \tag{66}$$

will then provide the basic bras for a representation for the original 
system, in which representation the ξ_A’s and the ξ_B’s will be diagonal. 
The ξ_A’s and ξ_B’s will together form a complete set of commuting 
observables for the original system. From (65) and (66) we get 

$$\langle\xi'_A|a\rangle\langle\xi'_B|b\rangle = \langle\xi'_A \xi'_B|ab\rangle, \tag{67}$$

showing that the representative of |ab> equals the product of the 
representatives of |a> and of |b> in their respective representations. 
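
The product construction can be checked on small finite-dimensional spaces. The following Python sketch is only an illustration under stated assumptions: kets are represented by short lists of complex numbers, and the helper names `product_ket` and `is_product` are hypothetical, introduced here and not taken from the text. It checks that the numbers of dimensions multiply, that the representative of |ab> is the product of the representatives of |a> and |b>, and that a sum of two product kets need not itself be a product ket.

```python
def product_ket(a, b):
    """Representative of |a>|b>: one number for each pair of basis labels (i, j)."""
    return {(i, j): ai * bj for i, ai in enumerate(a) for j, bj in enumerate(b)}


def is_product(ket, dim_a, dim_b, tol=1e-12):
    """True when the representative factorizes, i.e. every 2x2 minor vanishes."""
    for i in range(dim_a):
        for k in range(i + 1, dim_a):
            for j in range(dim_b):
                for m in range(j + 1, dim_b):
                    minor = ket[(i, j)] * ket[(k, m)] - ket[(i, m)] * ket[(k, j)]
                    if abs(minor) > tol:
                        return False
    return True


a = [1 + 2j, 0.5, -1j]   # representative of |a> in a 3-dimensional A-space
b = [2.0, 1j]            # representative of |b> in a 2-dimensional B-space
ab = product_ket(a, b)   # representative of |ab>
dimension = len(ab)      # number of basic kets of the product space

# A sum of two product kets built from basic kets of each factor space.
first = product_ket([1, 0, 0], [1, 0])
second = product_ket([0, 1, 0], [0, 1])
summed = {key: first[key] + second[key] for key in first}
```

Here `is_product(ab, 3, 2)` holds while `is_product(summed, 3, 2)` does not, which is the finite-dimensional counterpart of the remark that a general ket of the product space is a sum of kets of the product form.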

We can introduce the standard ket, >_A say, for the A-system, 
with respect to the representation with the ξ_A’s diagonal, and also 
the standard ket >_B for the B-system, with respect to the representation 
with the ξ_B’s diagonal. Their product >_A >_B is then the 
standard ket for the original system, with respect to the representation 
with the ξ_A’s and ξ_B’s diagonal. Any ket for the original system 
may be expressed as 

$$\psi(\xi_A, \xi_B)\rangle_A\rangle_B. \tag{68}$$


It may be that in a certain calculation we wish to use a particular 
representation for the B-system, say the above representation with 
the ξ_B’s diagonal, but do not wish to introduce any particular 
representation for the A-system. It would then be convenient to 
use the standard ket >_B for the B-system and no standard ket for 
the A-system. Under these circumstances we could write any ket 
for the original system as 

$$|\xi_B\rangle\rangle_B, \tag{69}$$

in which |ξ_B> is a ket for the A-system and is also a function of the 
ξ_B’s, i.e. it is a ket for the A-system for each set of values for the 
ξ_B’s; in fact (69) equals (68) if we take 

$$|\xi_B\rangle = \psi(\xi_A, \xi_B)\rangle_A.$$

We may leave the standard ket >_B in (69) understood, and then we 
have the general ket for the original system appearing as |ξ_B>, a ket 
for the A-system and a wave function in the variables ξ_B of the 
B-system. An example of this notation will be used in § 66. 

The above work can be immediately extended to a dynamical 
system describable in terms of dynamical variables which can be 
divided into three or more sets A, B, C,... such that any member of 
one set commutes with any member of another. Equation (65) gets 
generalized to 

$$|a\rangle|b\rangle|c\rangle\ldots = |abc\ldots\rangle,$$


the factors on the left being kets for the component systems and 
the ket on the right being a ket for the original system. Equations 
(66), (67), and (68) get generalized to many factors in a similar way. 




IV 
THE QUANTUM CONDITIONS 


21. Poisson brackets 

Our work so far has consisted in setting up a general mathematical 
scheme connecting states and observables in quantum mechanics. 
One of the dominant features of this scheme is that observables, and 
dynamical variables in general, appear in it as quantities which do 
not obey the commutative law of multiplication. It now becomes 
necessary for us to obtain equations to replace the commutative law 
of multiplication, equations that will tell us the value of ξη − ηξ when 
ξ and η are any two observables or dynamical variables. Only when 
such equations are known shall we have a complete scheme of 
mechanics with which to replace classical mechanics. These new 
equations are called quantum conditions or commutation relations. 

The problem of finding quantum conditions is not of such a general 
character as those we have been concerned with up to the present. It 
is instead a special problem which presents itself with each particular 
dynamical system one is called upon to study. There is, however, 
a fairly general method of obtaining quantum conditions, applicable 
to a very large class of dynamical systems. This is the method of 
classical analogy and will form the main theme of the present chapter. 
Those dynamical systems to which this method is not applicable 
must be treated individually and special considerations used in each 
case. 

The value of classical analogy in the development of quantum 
mechanics depends on the fact that classical mechanics provides a 
valid description of dynamical systems under certain conditions, 
when the particles and bodies composing the systems are sufficiently 
massive for the disturbance accompanying an observation to be 
negligible. Classical mechanics must therefore be a limiting case of 
quantum mechanics. We should thus expect to find that important 
concepts in classical mechanics correspond to important concepts in 
quantum mechanics, and, from an understanding of the general 
nature of the analogy between classical and quantum mechanics, we 
may hope to get laws and theorems in quantum mechanics appearing 
as simple generalizations of well-known results in classical mechanics; 
in particular we may hope to get the quantum conditions appearing 
as a simple generalization of the classical law that all dynamical 
variables commute. 

Let us take a dynamical system composed of a number of particles 
in interaction. As independent dynamical variables for dealing with 
the system we may use the Cartesian coordinates of all the particles 
and the corresponding Cartesian components of velocity of the par- 
ticles. It is, however, more convenient to work with the momentum 
components instead of the velocity components. Let us call the 
coordinates q_r, r going from 1 to three times the number of particles, 
and the corresponding momentum components p_r. The q’s and p’s 
are called canonical coordinates and momenta. 

The method of Lagrange’s equations of motion involves introducing 
coordinates q_r and momenta p_r in a more general way, applicable 
also for a system not composed of particles (e.g. a system containing 
rigid bodies). These more general q’s and p’s are also called canonical 
coordinates and momenta. Any dynamical variable is expressible in 
terms of a set of canonical coordinates and momenta. 

An important concept in general dynamical theory is the Poisson 
Bracket. Any two dynamical variables u and v have a P.B. (Poisson 
Bracket) which we shall denote by [u,v], defined by 


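$$[u,v] = \sum_r \left\{ \frac{\partial u}{\partial q_r}\frac{\partial v}{\partial p_r} - \frac{\partial u}{\partial p_r}\frac{\partial v}{\partial q_r} \right\}, \tag{1}$$

u and v being regarded as functions of a set of canonical coordinates 
and momenta q_r and p_r for the purpose of the differentiations. The 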
right-hand side of (1) is independent of which set of canonical 
coordinates and momenta are used, this being a consequence of the 
general definition of canonical coordinates and momenta, so the 
P.B. [u,v] is well defined. 
The main properties of P.B.s, which follow at once from their 
definition (1), are 

$$[u,v] = -[v,u], \tag{2}$$

$$[u,c] = 0, \tag{3}$$

where c is a number (which may be considered as a special case of a 
dynamical variable), 

$$\begin{aligned}
{}[u_1+u_2, v] &= [u_1,v]+[u_2,v], \\
{}[u, v_1+v_2] &= [u,v_1]+[u,v_2],
\end{aligned} \tag{4}$$

$$\begin{aligned}
{}[u_1 u_2, v] &= \sum_r \left\{ \left(\frac{\partial u_1}{\partial q_r}u_2 + u_1\frac{\partial u_2}{\partial q_r}\right)\frac{\partial v}{\partial p_r} - \left(\frac{\partial u_1}{\partial p_r}u_2 + u_1\frac{\partial u_2}{\partial p_r}\right)\frac{\partial v}{\partial q_r} \right\} \\
&= [u_1,v]u_2 + u_1[u_2,v], \\
{}[u, v_1 v_2] &= [u,v_1]v_2 + v_1[u,v_2].
\end{aligned} \tag{5}$$

Also the identity 

$$[u,[v,w]] + [v,[w,u]] + [w,[u,v]] = 0 \tag{6}$$

is easily verified. Equations (4) express that the P.B. [u,v] involves 
u and v linearly, while equations (5) correspond to the ordinary rules 
for differentiating a product. 
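
The properties (2), (5) and (6) can be verified exactly for polynomial dynamical variables of one degree of freedom. The sketch below is an illustration only: it assumes a polynomial in q and p is stored as a dictionary from exponent pairs to integer coefficients, and every function name in it (`add`, `mul`, `d_dq`, `d_dp`, `pb`, `neg`) is a hypothetical helper introduced here.

```python
from collections import defaultdict

# A polynomial in (q, p) is a dict mapping exponents (i, j) to the
# coefficient of q**i * p**j. Integer coefficients keep every check exact.


def clean(poly):
    """Drop terms whose coefficient is zero."""
    return {k: c for k, c in poly.items() if c != 0}


def add(x, y, sign=1):
    """x + sign * y."""
    out = defaultdict(int)
    for k, c in x.items():
        out[k] += c
    for k, c in y.items():
        out[k] += sign * c
    return clean(out)


def neg(x):
    return add({}, x, sign=-1)


def mul(x, y):
    out = defaultdict(int)
    for (i, j), c in x.items():
        for (k, m), d in y.items():
            out[(i + k, j + m)] += c * d
    return clean(out)


def d_dq(x):
    return clean({(i - 1, j): i * c for (i, j), c in x.items() if i > 0})


def d_dp(x):
    return clean({(i, j - 1): j * c for (i, j), c in x.items() if j > 0})


def pb(u, v):
    """Classical P.B. of definition (1), for a single pair q, p."""
    return add(mul(d_dq(u), d_dp(v)), mul(d_dp(u), d_dq(v)), sign=-1)


q = {(1, 0): 1}
p = {(0, 1): 1}
u = add(mul(q, q), mul(q, p))   # q^2 + q p
v = add(mul(p, p), q)           # p^2 + q
w = mul(mul(q, p), p)           # q p^2

antisymmetric = pb(u, v) == neg(pb(v, u))                                   # (2)
product_rule = pb(mul(u, v), w) == add(mul(pb(u, w), v), mul(u, pb(v, w)))  # (5)
jacobi = add(add(pb(u, pb(v, w)), pb(v, pb(w, u))), pb(w, pb(u, v)))        # (6)
```

With these choices `pb(q, p)` is the constant polynomial 1, and `jacobi` comes out as the empty dictionary, that is, the zero polynomial.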

Let us try to introduce a quantum P.B. which shall be the analogue 
of the classical one. We assume the quantum P.B. to satisfy all the 
conditions (2) to (6), it being now necessary that the order of the 
factors u_1 and u_2 in the first of equations (5) should be preserved 
throughout the equation, as in the way we have here written it, and 
similarly for the v_1 and v_2 in the second of equations (5). These conditions 
are already sufficient to determine the form of the quantum 
P.B. uniquely, as may be seen from the following argument. We can 
evaluate the P.B. [u_1 u_2, v_1 v_2] in two different ways, since we can use 
either of the two formulas (5) first, thus, 


$$\begin{aligned}
{}[u_1 u_2, v_1 v_2] &= [u_1, v_1 v_2]u_2 + u_1[u_2, v_1 v_2] \\
&= \{[u_1,v_1]v_2 + v_1[u_1,v_2]\}u_2 + u_1\{[u_2,v_1]v_2 + v_1[u_2,v_2]\} \\
&= [u_1,v_1]v_2 u_2 + v_1[u_1,v_2]u_2 + u_1[u_2,v_1]v_2 + u_1 v_1[u_2,v_2]
\end{aligned}$$

and 

$$\begin{aligned}
{}[u_1 u_2, v_1 v_2] &= [u_1 u_2, v_1]v_2 + v_1[u_1 u_2, v_2] \\
&= [u_1,v_1]u_2 v_2 + u_1[u_2,v_1]v_2 + v_1[u_1,v_2]u_2 + v_1 u_1[u_2,v_2].
\end{aligned}$$

Equating these two results, we obtain 

$$[u_1,v_1](u_2 v_2 - v_2 u_2) = (u_1 v_1 - v_1 u_1)[u_2,v_2].$$

Since this condition holds with u_1 and v_1 quite independent of u_2 and 
v_2, we must have 

$$u_1 v_1 - v_1 u_1 = i\hbar[u_1,v_1],$$
$$u_2 v_2 - v_2 u_2 = i\hbar[u_2,v_2],$$

where ħ must not depend on u_1 and v_1, nor on u_2 and v_2, and also 
must commute with (u_1 v_1 − v_1 u_1). It follows that ħ must be simply 
a number. We want the P.B. of two real variables to be real, as in 
the classical theory, which requires, from the work at the top of p. 28, 
that ħ shall be a real number when introduced, as here, with the 
coefficient i. We are thus led to the following definition for the 
quantum P.B. [u,v] of any two variables u and v, 

$$uv - vu = i\hbar[u,v], \tag{7}$$

in which ħ is a new universal constant. It has the dimensions of 
action. In order that the theory may agree with experiment, we 
must take ħ equal to h/2π, where h is the universal constant that 
was introduced by Planck, known as Planck’s constant. It is easily 
verified that the quantum P.B. satisfies all the conditions (2), (3), (4), 
(5), and (6). 
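
The verification can also be carried out numerically with small matrices standing in for non-commuting dynamical variables. This is a sketch under stated assumptions, not a general proof: the value `HBAR = 1.0` and the three Hermitian 2×2 matrices are arbitrary choices, and the helpers `matmul`, `combine`, `qpb`, `close` and `dagger` are hypothetical names introduced for the illustration.

```python
HBAR = 1.0  # assumption: units in which hbar = 1


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def combine(a, b, sign=1):
    """Entrywise a + sign * b."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def qpb(u, v):
    """Quantum P.B. defined by (7): [u, v] = (uv - vu) / (i hbar)."""
    diff = combine(matmul(u, v), matmul(v, u), sign=-1)
    return [[x / (1j * HBAR) for x in row] for row in diff]


def close(a, b, tol=1e-9):
    return all(abs(x - y) <= tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def dagger(a):
    """Conjugate transpose, the matrix form of the conjugate complex."""
    return [[a[j][i].conjugate() for j in range(len(a))] for i in range(len(a))]


# Three Hermitian ("real") matrices playing the part of dynamical variables.
u = [[1, 2 - 1j], [2 + 1j, -3]]
v = [[0, 1j], [-1j, 2]]
w = [[4, 1], [1, 0]]
zero = [[0, 0], [0, 0]]

# First of equations (5), with the order of the factors preserved.
product_rule_ok = close(qpb(matmul(u, v), w),
                        combine(matmul(qpb(u, w), v), matmul(u, qpb(v, w))))
# Identity (6).
jacobi_ok = close(combine(combine(qpb(u, qpb(v, w)), qpb(v, qpb(w, u))),
                          qpb(w, qpb(u, v))), zero)
# The P.B. of two real variables is real when the coefficient i is included.
real_ok = close(dagger(qpb(u, v)), qpb(u, v))
```

The last check illustrates why the coefficient i appears in (7): the bare difference uv − vu of two Hermitian matrices is anti-Hermitian, and dividing by iħ with real ħ restores a Hermitian result.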

The problem of finding quantum conditions now reduces to the 
problem of determining P.B.s in quantum mechanics. The strong 
analogy between the quantum P.B. defined by (7) and the classical 
P.B. defined by (1) leads us to make the assumption that the quantum 
P.B.s, or at any rate the simpler ones of them, have the same values 
as the corresponding classical P.B.s. The simplest P.B.s are those 
involving the canonical coordinates and momenta themselves and 
have the following values in the classical theory: 

$$[q_r,q_s] = 0, \qquad [p_r,p_s] = 0, \qquad [q_r,p_s] = \delta_{rs}. \tag{8}$$

We therefore assume that the corresponding quantum P.B.s also 
have the values given by (8). By eliminating the quantum P.B.s 
with the help of (7), we obtain the equations 

$$q_r q_s - q_s q_r = 0, \qquad p_r p_s - p_s p_r = 0, \qquad q_r p_s - p_s q_r = i\hbar\,\delta_{rs}, \tag{9}$$

which are the fundamental quantum conditions. They show us where 
the lack of commutability among the canonical coordinates and 
momenta lies. They also provide us with a basis for calculating commutation 
relations between other dynamical variables. For instance, 
if ξ and η are any two functions of the q’s and p’s expressible as 
power series, we may express ξη − ηξ or [ξ, η], by repeated applications 
of the laws (2), (3), (4), and (5), in terms of the elementary 
P.B.s given in (8) and so evaluate it. The result is often, in simple 
cases, the same as the classical result, or departs from the classical 
result only through requiring a special order for factors in a product, 
this order being, of course, unimportant in the classical theory. Even 
when ξ and η are more general functions of the q’s and p’s not expressible 
as power series, equations (9) are still sufficient to fix the 
value of ξη − ηξ, as will become clear from the following work. 
Equations (9) thus give the solution of the problem of finding the 
quantum conditions, for all those dynamical systems which have a 
classical analogue and which are describable in terms of canonical 
coordinates and momenta. This does not include all possible systems 
in quantum mechanics. 

Equations (7) and (9) provide the foundation for the analogy 
between quantum mechanics and classical mechanics. They show 
that classical mechanics may be regarded as the limiting case of quantum 
mechanics when ħ tends to zero. A P.B. in quantum mechanics is a 
purely algebraic notion and is thus a rather more fundamental con- 
cept than a classical P.B., which can be defined only with reference to 
a set of canonical coordinates and momenta. For this reason canonical 
coordinates and momenta are of less importance in quantum mechanics 
than in classical mechanics; in fact, we may have a system in quan- 
tum mechanics for which canonical coordinates and momenta do 
not exist and we can still give a meaning to P.B.s. Such a system 
would be one without a classical analogue and we should not be able 
to obtain its quantum conditions by the method here described. 

From equations (9) we see that two variables with different suffixes 
r and s always commute. It follows that any function of q_r and p_r 
will commute with any function of q_s and p_s when s differs from r. 
Different values of r correspond to different degrees of freedom of the 
dynamical system, so we get the result that dynamical variables 
referring to different degrees of freedom commute. This law, as we have 
derived it from (9), is proved only for dynamical systems with 
classical analogues, but we assume it to hold generally. In this way 
we can make a start on the problem of finding quantum conditions 
for dynamical systems for which canonical coordinates and momenta 
do not exist, provided we can give a meaning to different degrees of 
freedom, as we may be able to do with the help of physical insight. 

We can now see the physical meaning of the division, which was 
discussed in the preceding section, of the dynamical variables into 
sets, any member of one set commuting with any member of another. 
Each set corresponds to certain degrees of freedom, or possibly just 
one degree of freedom. The division may correspond to the physical 
process of resolving the dynamical system into its constituent parts, 
each constituent being capable of existing by itself as a physical 
system, and the various constituents having to be brought into 
interaction with one another to produce the original system. Alternatively 
the division may be merely a mathematical procedure of 
resolving the dynamical system into degrees of freedom which cannot 
be separated physically, e.g. the system consisting of a particle with 
internal structure may be divided into the degrees of freedom describ- 
ing the motion of the centre of the particle and those describing the 
internal structure. 


22. Schrödinger’s representation 

Let us consider a dynamical system with n degrees of freedom 
having a classical analogue, and thus describable in terms of canonical 
coordinates and momenta q_r, p_r (r = 1,2,...,n). We assume that the 
coordinates q_r are all observables and have continuous ranges of eigenvalues, 
these assumptions being reasonable from the physical significance 
of the q’s. Let us set up a representation with the q’s diagonal. 
The question arises whether the q’s form a complete commuting set 
for this dynamical system. It seems pretty obvious from inspection 
that they do. We shall here assume that they do, and the assumption 
will be justified later (see top of p. 92). With the q’s forming a 
complete commuting set, the representation is fixed except for the 
arbitrary phase factors in it. 

Let us consider first the case of n = 1, so that there is only one q 
and p, satisfying 

$$qp - pq = i\hbar. \tag{10}$$


Any ket may be written in the standard ket notation ψ(q)>. From it 
we can form another ket dψ/dq>, whose representative is the derivative 
of the original one. This new ket is a linear function of the 
original one and is thus the result of some linear operator applied to 
the original one. Calling this linear operator d/dq, we have 

$$\frac{d}{dq}\psi\rangle = \frac{d\psi}{dq}\rangle. \tag{11}$$

Equation (11) holding for all functions ψ defines the linear operator 
d/dq. We have 

$$\frac{d}{dq}\rangle = 0. \tag{12}$$

Let us treat the linear operator d/dq according to the general theory 
of linear operators of § 7. We should then be able to apply it to a bra 
<φ(q), the product <φ d/dq being defined, according to (3) of § 7, by 

$$\left\{\langle\phi\frac{d}{dq}\right\}\psi\rangle = \langle\phi\left\{\frac{d}{dq}\psi\rangle\right\} \tag{13}$$

for all functions ψ(q). Taking representatives, we get 

$$\int \langle\phi\frac{d}{dq}|q'\rangle\, dq'\, \psi(q') = \int \phi(q')\, dq'\, \frac{d\psi(q')}{dq'}. \tag{14}$$

We can transform the right-hand side by partial integration and get 

$$\int \langle\phi\frac{d}{dq}|q'\rangle\, dq'\, \psi(q') = -\int \frac{d\phi(q')}{dq'}\, dq'\, \psi(q'), \tag{15}$$

provided the contributions from the limits of integration vanish. 
This gives 

$$\langle\phi\frac{d}{dq}|q'\rangle = -\frac{d\phi(q')}{dq'},$$

showing that 

$$\langle\phi\frac{d}{dq} = -\langle\frac{d\phi}{dq}. \tag{16}$$

Thus d/dq operating to the left on the conjugate complex of a wave 
function has the meaning of minus differentiation with respect to q. 

The validity of this result depends on our being able to make the 
passage from (14) to (15), which requires that we must restrict our- 
selves to bras and kets corresponding to wave functions that satisfy 
suitable boundary conditions. The conditions usually holding in 
practice are that they vanish at the boundaries. (Somewhat more 
general conditions will be given in the next section.) These conditions 
do not limit the physical applicability of the theory, but, on the con- 
trary, are usually required also on physical grounds. For example, 
if q is a Cartesian coordinate of a particle, its eigenvalues run from 
−∞ to ∞, and the physical requirement that the particle has zero 
probability of being at infinity leads to the condition that the wave 
function vanishes for q = ±∞. 
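
The role of the boundary conditions in the partial integration can be seen numerically. The sketch below is an illustration under stated assumptions: the two real Gaussian-type wave functions and the trapezoidal helper `integrate` are arbitrary choices made here, not taken from the text. It compares the integral of φ dψ/dq with minus the integral of dφ/dq ψ, first for functions that vanish at the boundaries and then on a finite range where the end values do not vanish.

```python
import math


def integrate(f, lo, hi, steps=20000):
    """Trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, steps):
        total += f(lo + k * h)
    return total * h


# Wave functions that vanish for q -> +-infinity, with their derivatives.
phi = lambda q: math.exp(-(q - 1.0) ** 2)
dphi = lambda q: -2.0 * (q - 1.0) * math.exp(-(q - 1.0) ** 2)
psi = lambda q: q * math.exp(-0.5 * q * q)
dpsi = lambda q: (1.0 - q * q) * math.exp(-0.5 * q * q)

with_phi_dpsi = integrate(lambda q: phi(q) * dpsi(q), -12.0, 12.0)
with_minus_dphi_psi = -integrate(lambda q: dphi(q) * psi(q), -12.0, 12.0)

# Counter-example on the finite range 0..1 with phi = 1 and psi = q:
# the end values do not vanish, and the two sides differ by the
# boundary term phi * psi taken between the limits, which is 1.
finite_phi_dpsi = integrate(lambda q: 1.0 * 1.0, 0.0, 1.0)
finite_minus_dphi_psi = -integrate(lambda q: 0.0 * q, 0.0, 1.0)
boundary_term = 1.0 * 1.0 - 1.0 * 0.0
```

The first pair of numbers agrees to within rounding, while the second pair differs by exactly `boundary_term`.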

The conjugate complex of the linear operator d/dq can be evaluated 
by noting that the conjugate imaginary of d/dq ψ> or dψ/dq> is 
<dψ̄/dq, or −<ψ̄ d/dq from (16). Thus the conjugate complex of d/dq 
is −d/dq, so d/dq is a pure imaginary linear operator. 

To get the representative of d/dq we note that, from an application 
of formula (63) of § 20, 

$$|q''\rangle = \delta(q-q'')\rangle, \tag{17}$$

so that 

$$\frac{d}{dq}|q''\rangle = \frac{d}{dq}\delta(q-q'')\rangle, \tag{18}$$

and hence 

$$\langle q'|\frac{d}{dq}|q''\rangle = \frac{d}{dq'}\delta(q'-q''). \tag{19}$$

The representative of d/dq involves the derivative of the δ function. 

Let us work out the commutation relation connecting d/dq with q. 
We have 

$$\frac{d}{dq}q\psi\rangle = \frac{d(q\psi)}{dq}\rangle = q\frac{d}{dq}\psi\rangle + \psi\rangle. \tag{20}$$

Since this holds for any ket ψ>, we have 

$$\frac{d}{dq}q - q\frac{d}{dq} = 1. \tag{21}$$

Comparing this result with (10), we see that −iħ d/dq satisfies the 
same commutation relation with q that p does. 
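
The algebra of (20) and (21) can be checked exactly on polynomial functions of q. The sketch below is only an illustration: polynomials are not admissible wave functions, since they do not satisfy the boundary conditions, but the operator identity is purely algebraic and holds for them. The coefficient-list representation, the value `HBAR = 1.0` and the helper names `ddq`, `times_q`, `subtract` and `p_op` are assumptions introduced here.

```python
HBAR = 1.0  # assumption: units in which hbar = 1


def ddq(psi):
    """d/dq acting on psi(q) = psi[0] + psi[1] q + psi[2] q^2 + ..."""
    return [k * c for k, c in enumerate(psi)][1:]


def times_q(psi):
    """Multiplication by q raises every power by one."""
    return [0] + list(psi)


def subtract(x, y):
    n = max(len(x), len(y))
    x = list(x) + [0] * (n - len(x))
    y = list(y) + [0] * (n - len(y))
    return [s - t for s, t in zip(x, y)]


def p_op(psi):
    """The operator -i hbar d/dq."""
    return [-1j * HBAR * c for c in ddq(psi)]


psi = [3, -1, 4, 2]   # 3 - q + 4 q^2 + 2 q^3

# (d/dq q - q d/dq) psi, which by (21) should reproduce psi itself.
commutator_on_psi = subtract(ddq(times_q(psi)), times_q(ddq(psi)))

# (q p - p q) psi with p = -i hbar d/dq, which by (10) should be i hbar psi.
qp_minus_pq_on_psi = subtract(times_q(p_op(psi)), p_op(times_q(psi)))
```

Both results follow the pattern of (20): differentiating qψ produces the extra term ψ that survives in the commutator.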

To extend the foregoing work to the case of arbitrary n, we write 
the general ket as ψ(q_1...q_n)> = ψ> and introduce the n linear operators 
∂/∂q_r (r = 1,...,n), which can operate on it in accordance with 
the formula 

$$\frac{\partial}{\partial q_r}\psi\rangle = \frac{\partial\psi}{\partial q_r}\rangle, \tag{22}$$

corresponding to (11). We have 

$$\frac{\partial}{\partial q_r}\rangle = 0, \tag{23}$$

corresponding to (12). Provided we restrict ourselves to bras and 
kets corresponding to wave functions satisfying suitable boundary 
conditions, these linear operators can operate also on bras, in accordance 
with the formula 

$$\langle\phi\frac{\partial}{\partial q_r} = -\langle\frac{\partial\phi}{\partial q_r}, \tag{24}$$

corresponding to (16). Thus ∂/∂q_r can operate to the left on the 
conjugate complex of a wave function, when it has the meaning of 
minus partial differentiation with respect to q_r. We find as before 
that each ∂/∂q_r is a pure imaginary linear operator. Corresponding 
to (21) we have the commutation relations 

$$\frac{\partial}{\partial q_r}q_s - q_s\frac{\partial}{\partial q_r} = \delta_{rs}. \tag{25}$$

We have further 

$$\frac{\partial}{\partial q_r}\frac{\partial}{\partial q_s}\psi\rangle = \frac{\partial^2\psi}{\partial q_r\,\partial q_s}\rangle = \frac{\partial}{\partial q_s}\frac{\partial}{\partial q_r}\psi\rangle, \tag{26}$$

showing that 

$$\frac{\partial}{\partial q_r}\frac{\partial}{\partial q_s} = \frac{\partial}{\partial q_s}\frac{\partial}{\partial q_r}. \tag{27}$$

Comparing (25) and (27) with (9), we see that the linear operators 
−iħ ∂/∂q_r satisfy the same commutation relations with the q’s and with 
each other that the p’s do. 

It would be possible to take 

$$p_r = -i\hbar\,\partial/\partial q_r \tag{28}$$

without getting any inconsistency. This possibility enables us to see 
that the q’s must form a complete commuting set of observables, 
since it means that any function of the q’s and p’s could be taken 
to be a function of the q’s and −iħ ∂/∂q’s and then could not commute 
with all the q’s unless it is a function of the q’s only. 

The equations (28) do not necessarily hold. But in any case the 
quantities p_r + iħ ∂/∂q_r each commute with all the q’s, so each of them 
is a function of the q’s, from Theorem 2 of § 19. Thus 

$$p_r = -i\hbar\,\partial/\partial q_r + f_r(q). \tag{29}$$

Since p_r and −iħ ∂/∂q_r are both real, f_r(q) must be real. For any 
function f of the q’s we have 

$$\frac{\partial}{\partial q_r}f\psi\rangle = f\frac{\partial}{\partial q_r}\psi\rangle + \frac{\partial f}{\partial q_r}\psi\rangle,$$

showing that 

$$\frac{\partial}{\partial q_r}f - f\frac{\partial}{\partial q_r} = \frac{\partial f}{\partial q_r}. \tag{30}$$

With the help of (29) we can now deduce the general formula 

$$p_r f - f p_r = -i\hbar\,\partial f/\partial q_r. \tag{31}$$

This formula may be written in P.B. notation 

$$[f, p_r] = \partial f/\partial q_r, \tag{32}$$

when it is the same as in the classical theory, as follows from (1). 
Multiplying (27) by (−iħ)² and substituting for −iħ ∂/∂q_r and −iħ ∂/∂q_s 
their values given by (29), we get 

$$(p_r - f_r)(p_s - f_s) = (p_s - f_s)(p_r - f_r),$$

which reduces, with the help of the quantum condition p_r p_s = p_s p_r, to 

$$p_r f_s + f_r p_s = p_s f_r + f_s p_r.$$

This reduces further, with the help of (31), to 

$$\partial f_s/\partial q_r = \partial f_r/\partial q_s, \tag{33}$$

showing that the functions f_r are all of the form 

$$f_r = \partial F/\partial q_r \tag{34}$$

with F independent of r. Equation (29) now becomes 

$$p_r = -i\hbar\,\partial/\partial q_r + \partial F/\partial q_r. \tag{35}$$

We have been working with a representation which is fixed to the 
extent that the q’s must be diagonal in it, but which contains arbitrary 
phase factors. If the phase factors are changed, the operators ∂/∂q_r 
get changed. It will now be shown that, by a suitable change in the 
phase factors, the function F in (35) can be made to vanish, so that 
equations (28) are made to hold. 

Using stars to distinguish quantities referring to the new representation 
with the new phase factors, we shall have the new basic 
bras connected with the previous ones by 

$$\langle q'_1 \ldots q'_n{}^*| = e^{i\gamma'}\langle q'_1 \ldots q'_n|, \tag{36}$$

where γ' = γ(q') is a real function of the q'’s. The new representative 
of a ket is e^{iγ'} times the old one, showing that e^{iγ}ψ>* = ψ>, so 
we get 

$$\rangle^* = e^{-i\gamma}\rangle \tag{37}$$

as the connexion between the new standard ket and the original one. 
The new linear operator (∂/∂q_r)* satisfies, corresponding to (22), 

$$\left(\frac{\partial}{\partial q_r}\right)^*\psi\rangle^* = \frac{\partial\psi}{\partial q_r}\rangle^* = e^{-i\gamma}\frac{\partial\psi}{\partial q_r}\rangle,$$

with the help of (37). Using (22), this gives 

$$\left(\frac{\partial}{\partial q_r}\right)^*\psi\rangle^* = e^{-i\gamma}\frac{\partial}{\partial q_r}\psi\rangle = e^{-i\gamma}\frac{\partial}{\partial q_r}e^{i\gamma}\psi\rangle^*,$$

showing that 

$$\left(\frac{\partial}{\partial q_r}\right)^* = e^{-i\gamma}\frac{\partial}{\partial q_r}e^{i\gamma}, \tag{38}$$

or, with the help of (30), 

$$\left(\frac{\partial}{\partial q_r}\right)^* = \frac{\partial}{\partial q_r} + i\frac{\partial\gamma}{\partial q_r}. \tag{39}$$

By choosing γ so that 

$$F = \hbar\gamma + \text{a constant}, \tag{40}$$

(35) becomes 

$$p_r = -i\hbar\left(\frac{\partial}{\partial q_r}\right)^*. \tag{41}$$

Equation (40) fixes γ except for an arbitrary constant, so the representation 
is fixed except for an arbitrary constant phase factor. 

In this way we see that a representation can be set up in which 
the q’s are diagonal and equations (28) hold. This representation is 
a very useful one for many problems. It will be called Schrödinger’s 
representation, as it was the representation in terms of which Schrödinger 
gave his original formulation of quantum mechanics in 1926. 
Schrödinger’s representation exists whenever one has canonical q’s 
and p’s, and is completely determined by these q’s and p’s except for 
an arbitrary constant phase factor. It owes its great convenience to 
its allowing one to express immediately any algebraic function of the 
q’s and p’s of the form of a power series in the p’s as an operator of 
differentiation, e.g. if f(q_1,...,q_n, p_1,...,p_n) is such a function, we have 

$$f(q_1,\ldots,q_n, p_1,\ldots,p_n) = f(q_1,\ldots,q_n, -i\hbar\,\partial/\partial q_1,\ldots,-i\hbar\,\partial/\partial q_n), \tag{42}$$

provided we preserve the order of the factors in a product on substituting 
the −iħ ∂/∂q’s for the p’s. 
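From (23) and (28), we have 

$$p_r\rangle = 0. \tag{43}$$

Thus the standard ket in Schrödinger’s representation is characterized 
by the condition that it is a simultaneous eigenket of all the momenta 
belonging to the eigenvalues zero. Some properties of the basic 
vectors of Schrödinger’s representation may also be noted. Equation 
(22) gives 

$$\langle q'_1 \ldots q'_n|\frac{\partial}{\partial q_r}\psi\rangle = \langle q'_1 \ldots q'_n|\frac{\partial\psi}{\partial q_r}\rangle = \frac{\partial\psi(q'_1 \ldots q'_n)}{\partial q'_r} = \frac{\partial}{\partial q'_r}\langle q'_1 \ldots q'_n|\psi\rangle.$$

Hence 

$$\langle q'_1 \ldots q'_n|\frac{\partial}{\partial q_r} = \frac{\partial}{\partial q'_r}\langle q'_1 \ldots q'_n|, \tag{44}$$

so that 

$$\langle q'_1 \ldots q'_n|p_r = -i\hbar\frac{\partial}{\partial q'_r}\langle q'_1 \ldots q'_n|. \tag{45}$$

Similarly, equation (24) leads to 

$$p_r|q'_1 \ldots q'_n\rangle = i\hbar\frac{\partial}{\partial q'_r}|q'_1 \ldots q'_n\rangle. \tag{46}$$
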

23. The momentum representation 
Let us take a system with one degree of freedom, describable in 
terms of a q and p with the eigenvalues of q running from −∞ to ∞, 
and let us take an eigenket |p'> of p. Its representative in the Schrödinger 
representation, <q'|p'>, satisfies 

$$p'\langle q'|p'\rangle = \langle q'|p|p'\rangle = -i\hbar\frac{d}{dq'}\langle q'|p'\rangle,$$

with the help of (45) applied to the case of one degree of freedom. 
The solution of this differential equation for <q'|p'> is 

$$\langle q'|p'\rangle = c'\,e^{ip'q'/\hbar}, \tag{47}$$

where c' = c(p') is independent of q', but may involve p'. 
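
That the exponential (47) solves the differential equation can be checked with a finite-difference derivative. This is a minimal sketch under stated assumptions: the numerical values of `HBAR`, `P_EIG` and `C` are arbitrary choices made for illustration, and the function names `rep` and `momentum_applied` are hypothetical.

```python
import cmath

HBAR = 0.5    # assumption: an arbitrary positive value, for illustration only
P_EIG = 1.3   # the eigenvalue p'
C = 2.0       # the coefficient c', independent of q'


def rep(q):
    """Representative <q'|p'> = c' exp(i p' q' / hbar), as in (47)."""
    return C * cmath.exp(1j * P_EIG * q / HBAR)


def momentum_applied(q, step=1e-5):
    """-i hbar d/dq' of the representative, by a central difference."""
    return -1j * HBAR * (rep(q + step) - rep(q - step)) / (2 * step)


points = [-3.0, 0.0, 0.7, 50.0]
# Difference between -i hbar d/dq' <q'|p'> and p' <q'|p'> at each point.
residuals = [abs(momentum_applied(q) - P_EIG * rep(q)) for q in points]
# The modulus is the same at every q', however large.
moduli = [abs(rep(q)) for q in points]
```

The residuals are at the level of the finite-difference error, while the moduli all equal |c'|, so the representative does not fall off as q' grows.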


The representative <q'|p'> does not satisfy the boundary conditions 
of vanishing at q' = ±∞. This gives rise to some difficulty, which 
shows itself up most directly in the failure of the orthogonality 
theorem. If we take a second eigenket |p''> of p with representative 

$$\langle q'|p''\rangle = c''\,e^{ip''q'/\hbar},$$

belonging to a different eigenvalue p'', we shall have 

$$\langle p'|p''\rangle = \int_{-\infty}^{\infty} \langle p'|q'\rangle\, dq'\, \langle q'|p''\rangle = \bar{c}'c'' \int_{-\infty}^{\infty} e^{-i(p'-p'')q'/\hbar}\, dq'. \tag{48}$$

This integral does not converge according to the usual definition of 
convergence. To bring the theory into order, we adopt a new definition 
of convergence of an integral whose domain extends to infinity, 
analogous to the Cesàro definition of the sum of an infinite series. 
With this new definition, an integral whose value to the upper limit 
q' is of the form cos aq' or sin aq', with a a real number not zero, is 
counted as zero when q' tends to infinity, i.e. we take the mean value 
of the oscillations, and similarly for the lower limit of q' tending to 
minus infinity. This makes the right-hand side of (48) vanish for 
p'' ≠ p', so that the orthogonality theorem is restored. Also it makes 
the right-hand sides of (13) and (14) equal when <φ and ψ> are eigenvectors 
of p, so that eigenvectors of p become permissible vectors to 
use with the operator d/dq. Thus the boundary conditions that the 
representative of a permissible bra or ket has to satisfy become 
extended to allow the representative to oscillate like cos aq' or sin aq' 
as q' goes to infinity or minus infinity. 
For p'' very close to p', the right-hand side of (48) involves a δ 
function. To evaluate it, we need the formula 

$$\int_{-\infty}^{\infty} e^{iax}\, dx = 2\pi\,\delta(a) \tag{49}$$

for real a, which may be proved as follows. The formula evidently 
holds for a different from zero, as both sides are then zero. Further 
we have, for any continuous function f(a), 

$$\int_{-\infty}^{\infty} f(a)\, da \int_{-g}^{g} e^{iax}\, dx = \int_{-\infty}^{\infty} f(a)\, da\; 2a^{-1}\sin ag = 2\pi f(0)$$

in the limit when g tends to infinity. A more complicated argument 
shows that we get the same result if instead of the limits g and −g 
we put g_1 and −g_2, and then let g_1 and g_2 tend to infinity in different 
ways (not too widely different). This shows the equivalence of both 
sides of (49) as factors in an integrand, which proves the formula. 

With the help of (49), (48) becomes 

$$\langle p'|p''\rangle = \bar{c}'c''\, 2\pi\,\delta[(p'-p'')/\hbar] = \bar{c}'c''\, h\,\delta(p'-p'') = |c'|^2 h\,\delta(p'-p''). \tag{50}$$
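
The limiting statement used to prove formula (49) can be watched numerically. The following sketch is an illustration under stated assumptions: the test function `f` is an arbitrary smooth choice with f(0) = 1, the integration range is cut off at ±10 where `f` is negligible, and the helper names `kernel` and `smeared` are hypothetical.

```python
import math


def f(a):
    """A smooth test function with f(0) = 1 (an arbitrary choice)."""
    return (1.0 + a) * math.exp(-a * a)


def kernel(a, g):
    """Integral of exp(i a x) over x from -g to g, i.e. 2 sin(a g) / a."""
    return 2.0 * g if a == 0.0 else 2.0 * math.sin(a * g) / a


def smeared(g, lo=-10.0, hi=10.0, steps=40000):
    """Trapezoidal estimate of the integral of f(a) * kernel(a, g) over a."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) * kernel(lo, g) + f(hi) * kernel(hi, g))
    for k in range(1, steps):
        a = lo + k * h
        total += f(a) * kernel(a, g)
    return total * h


values = {g: smeared(g) for g in (1.0, 5.0, 20.0)}
target = 2.0 * math.pi * f(0.0)
```

As g increases the entries of `values` approach `target`, in line with the statement that the integral tends to 2πf(0).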


We have obtained an eigenket of p belonging to any real eigenvalue 
p’, its representative being given by (47). Any ket |X> can be ex- 
panded in terms of these eigenkets of p, since its representative 
<q'|X> can be expanded in terms of the representatives (47) by 
Fourier analysis. It follows that the momentum p is an observable, 
in agreement with the experimental result that momenta can be 
observed. 

A symmetry now appears between q and p. Each of them is an 
observable with eigenvalues extending from −∞ to ∞, and the 
commutation relation connecting q and p, equation (10), remains 
invariant if we interchange q and p and write −i for i. We have set 
up a representation in which q is diagonal and p = −iħ d/dq. It 
follows from the symmetry that we can also set up a representation 
in which p is diagonal and 

$$q = i\hbar\, d/dp, \tag{51}$$

the operator d/dp being defined by a procedure similar to that used 
for d/dq. This representation will be called the momentum representation. 
It is less useful than the previous Schrödinger representation 
because, while the Schrödinger representation enables one to express 
as an operator of differentiation any function of q and p that is a 
power series in p, the momentum representation enables one so to 
express any function of q and p that is a power series in q, and the 
important quantities in dynamics are almost always power series in 
p but are often not power series in q. All the same the momentum 
representation is of value for certain problems (see § 50). 

Let us calculate the transformation function <q'|p'> connecting the
two representations. The basic kets |p'> of the momentum representation
are eigenkets of p and their Schrödinger representatives <q'|p'>
are given by (47) with the coefficients c' suitably chosen. The phase
factors of these basic kets must be chosen so as to make (51) hold.
The easiest way to bring in this condition is to use the symmetry
between q and p referred to above, according to which <q'|p'> must
go over into <p'|q'> if we interchange q' and p' and write −i for i.
Now <q'|p'> is equal to the right-hand side of (47) and <p'|q'> to the
conjugate complex expression, and hence c' must be independent of
p'. Thus c' is just a number c. Further, we must have

$$\langle p'|p''\rangle = \delta(p'-p''),$$

which shows, on comparison with (50), that $|c| = h^{-1/2}$. We can choose
the arbitrary constant phase factor in either representation so as to
make $c = h^{-1/2}$, and we then get

$$\langle q'|p'\rangle = h^{-1/2} e^{ip'q'/\hbar} \tag{52}$$

for the transformation function.
The foregoing work may easily be generalized to a system with
n degrees of freedom, describable in terms of n q’s and p’s, with the
eigenvalues of each q running from −∞ to ∞. Each p will then be
an observable with eigenvalues running from −∞ to ∞, and there
will be symmetry between the set of q’s and the set of p’s, the
commutation relations remaining invariant if we interchange each $q_r$
with the corresponding $p_r$ and write −i for i. A momentum representation
can be set up in which the p’s are diagonal and each

$$q_r = i\hbar\,\partial/\partial p_r. \tag{53}$$

The transformation function connecting it with the Schrödinger
representation will be given by the product of the transformation
functions for each degree of freedom separately, as is shown by
formula (67) of § 20, and will thus be

$$\langle q'_1 q'_2 \ldots q'_n | p'_1 p'_2 \ldots p'_n\rangle = \langle q'_1|p'_1\rangle\langle q'_2|p'_2\rangle\ldots\langle q'_n|p'_n\rangle = h^{-n/2} e^{i(p'_1 q'_1 + p'_2 q'_2 + \ldots + p'_n q'_n)/\hbar}. \tag{54}$$


24. Heisenberg’s principle of uncertainty 
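For a system with one degree of freedom, the Schrödinger and the
momentum representatives of a ket |X> are connected by

$$\begin{aligned}
\langle p'|X\rangle &= h^{-1/2}\int_{-\infty}^{\infty} e^{-iq'p'/\hbar}\,dq'\,\langle q'|X\rangle,\\
\langle q'|X\rangle &= h^{-1/2}\int_{-\infty}^{\infty} e^{iq'p'/\hbar}\,dp'\,\langle p'|X\rangle.
\end{aligned} \tag{55}$$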


These formulas have an elementary significance. They show that 
either of the representatives is given, apart from numerical coefficients, 
by the amplitudes of the Fourier components of the other. 

It is interesting to apply (55) to a ket whose Schrödinger representative
consists of what is called a wave packet. This is a function
whose value is very small everywhere outside a certain domain, of
width Δq' say, and inside this domain is approximately periodic with
a definite frequency.† If a Fourier analysis is made of such a wave
packet, the amplitude of all the Fourier components will be small,
except those in the neighbourhood of the definite frequency. The
components whose amplitudes are not small will fill up a frequency†
band whose width is of the order 1/Δq', since two components whose
frequencies differ by this amount, if in phase in the middle of the
domain Δq', will be just out of phase and interfering at the ends of
this domain. Now in the first of equations (55) the variable
$(2\pi)^{-1}p'/\hbar = p'/h$ plays the part of frequency. Thus with <q'|X> of the
form of a wave packet, the function <p'|X>, being composed of the
amplitudes of the Fourier components of the wave packet, will be
small everywhere in the p'-space outside a certain domain of width
Δp' = h/Δq'.

Let us now apply the physical interpretation of the square of the 
modulus of the representative of a ket as a probability. We find that 
our wave packet represents a state for which a measurement of q is 
almost certain to lead to a result lying in a domain of width Δq' and
a measurement of p is almost certain to lead to a result lying in a
domain of width Δp'. We may say that for this state q has a definite
value with an error of order Δq' and p has a definite value with an
error of order Δp'. The product of these two errors is

$$\Delta q'\,\Delta p' = h. \tag{56}$$

Thus the more accurately one of the variables q, p has a definite
value, the less accurately the other has a definite value. For a system
with several degrees of freedom, equation (56) applies to each degree
of freedom separately.
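
The reciprocal relation between the two widths can be illustrated numerically by applying the first of equations (55) to a particular wave packet. The sketch below rests on assumptions that are not in the text: a Gaussian envelope, units in which ħ = 1, simple Riemann sums in place of the integrals, and the root-mean-square deviation as the precise meaning of "width". With that definition of width a Gaussian packet gives a product of ħ/2 rather than h; the two differ only by a numerical factor, which is all that an order-of-magnitude statement such as (56) is concerned with.

```python
import cmath
import math

HBAR = 1.0                     # units with hbar = 1 (assumption of this sketch)
H = 2.0 * math.pi * HBAR       # h = 2 pi hbar

def wave_packet(q, sigma, p0):
    """Gaussian envelope of extent ~sigma times a wave of momentum p0."""
    return math.exp(-q * q / (4.0 * sigma * sigma)) * cmath.exp(1j * p0 * q / HBAR)

def momentum_representative(p, qs, psi, dq):
    """First of equations (55), with the integral replaced by a Riemann sum."""
    total = sum(cmath.exp(-1j * q * p / HBAR) * amp for q, amp in zip(qs, psi))
    return total * dq / math.sqrt(H)

def spread(xs, weights):
    """Root-mean-square deviation of xs under unnormalised weights."""
    norm = sum(weights)
    mean = sum(x * w for x, w in zip(xs, weights)) / norm
    return math.sqrt(sum((x - mean) ** 2 * w for x, w in zip(xs, weights)) / norm)

def uncertainty_product(sigma=1.0, p0=3.0, n_q=401, n_p=201):
    """Return (delta_q, delta_p, delta_q * delta_p) for the Gaussian packet."""
    dq = 20.0 * sigma / (n_q - 1)
    qs = [-10.0 * sigma + k * dq for k in range(n_q)]
    psi = [wave_packet(q, sigma, p0) for q in qs]
    dp = 10.0 * HBAR / sigma / (n_p - 1)
    ps = [p0 - 5.0 * HBAR / sigma + k * dp for k in range(n_p)]
    phi = [momentum_representative(p, qs, psi, dq) for p in ps]
    delta_q = spread(qs, [abs(a) ** 2 for a in psi])
    delta_p = spread(ps, [abs(a) ** 2 for a in phi])
    return delta_q, delta_p, delta_q * delta_p

for sigma in (0.5, 1.0, 2.0):
    print(sigma, uncertainty_product(sigma=sigma))
```

Halving the extent of the packet in q should double its extent in p, the product staying fixed.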

Equation (56) is known as Heisenberg’s Principle of Uncertainty.
It shows clearly the limitations in the possibility of simultaneously
assigning numerical values, for any particular state, to two
non-commuting observables, when those observables are a canonical
coordinate and momentum, and provides a plain illustration of how
observations in quantum mechanics may be incompatible. It also
shows how classical mechanics, which assumes that numerical values
can be assigned simultaneously to all observables, may be a valid
approximation when ħ can be considered as small enough to be
negligible. Equation (56) holds only in the most favourable case,
which occurs when the representative of the state is of the form of a
wave packet. Other forms of representative would lead to a Δq' and
Δp' whose product is larger than h.

† Frequency here means reciprocal of wave-length.

Heisenberg’s principle of uncertainty shows that, in the limit when 
either q or p is completely determined, the other is completely
undetermined. This result can also be obtained directly from the
transformation function <q'|p'>. According to the end of § 18,
|<q'|p'>|² dq' is proportional to the probability of q having a value in
the small range from q' to q' + dq' for the state for which p certainly
has the value p', and from (52) this probability is independent of q'
for a given dq'. Thus if p certainly has a definite value p', all values
of q are equally probable. Similarly, if q certainly has a definite value
q', all values of p are equally probable.

It is evident physically that a state for which all values of q are
equally probable, or one for which all values of p are equally probable,
cannot be attained in practice, in the first case because of limitations
of size and in the second because of limitations of energy. Thus an
eigenstate of p or an eigenstate of q cannot be attained in practice.
The argument at the end of § 12 already showed that such eigenstates 
are unattainable, because of the infinite precision that would be 
needed to set them up, and we now have another argument leading 
to the same conclusion. 


25. Displacement operators 

We get a new insight into the meaning of some of the quantum con- 
ditions by making a study of displacement operators. These appear 
in the theory when we take into consideration that the scheme of 
relations between states and dynamical variables given in Chapter II
is essentially a physical scheme, so that if certain states and dynamical
variables are connected by some relation, on our displacing them all
in a definite way (for example, displacing them all through a distance
δx in the direction of the x-axis of Cartesian coordinates), the new
states and dynamical variables would have to be connected by the 
same relation. 

The displacement of a state or observable is a perfectly definite 
process physically. Thus to displace a state or observable through a 
distance δx in the direction of the x-axis, we should merely have to
displace all the apparatus used in preparing the state, or all the
apparatus required to measure the observable, through the distance
δx in the direction of the x-axis, and the displaced apparatus would
define the displaced state or observable. The displacement of a
dynamical variable must be just as definite as the displacement of 
an observable, because of the close mathematical connexion between 
dynamical variables and observables. A displaced state or dynamical 
variable is uniquely determined by the undisplaced state or dynami- 
cal variable together with the direction and magnitude of the dis- 
placement. 

The displacement of a ket vector is not such a definite thing though. 
If we take a certain ket vector, it will represent a certain state and we 
may displace this state and get a perfectly definite new state, but this 
new state will not determine our displaced ket, but only the direction 
of our displaced ket. We help to fix our displaced ket by requiring 
that it shall have the same length as the undisplaced ket, but even 
then it is not completely determined, but can still be multiplied by 
an arbitrary phase factor. One would think at first sight that each 
ket one displaces would have a different arbitrary phase factor, 
but with the help of the following argument, we see that it must be 
the same for them all. We make use of the law that superposition 
relationships between states remain invariant under the displace- 
ment. A superposition relationship between states is expressed 
mathematically by a linear equation between the kets corresponding 
to those states, for example 


$$|R\rangle = c_1|A\rangle + c_2|B\rangle, \tag{57}$$

where $c_1$ and $c_2$ are numbers, and the invariance of the superposition
relationship requires that the displaced states correspond to kets
with the same linear equation between them; in our example they
would correspond to |Rd>, |Ad>, |Bd> say, satisfying

$$|Rd\rangle = c_1|Ad\rangle + c_2|Bd\rangle. \tag{58}$$

We take these kets to be our displaced kets, rather than these kets
multiplied by arbitrary independent phase factors, which latter
kets would satisfy a linear equation with different coefficients $c_1$, $c_2$.
The only arbitrariness now left in the displaced kets is that of a single 
arbitrary phase factor to be multiplied into all of them. 

The condition that linear equations between the kets remain in- 
variant under the displacement and that an equation such as (58) 
holds whenever the corresponding (57) holds, means that the displaced
kets are linear functions of the undisplaced kets and thus each
displaced ket |Pd> is the result of some linear operator applied to the
corresponding undisplaced ket |P>. In symbols,

$$|Pd\rangle = D|P\rangle, \tag{59}$$

where D is a linear operator independent of |P> and depending only 
on the displacement. The arbitrary phase factor by which all the 
displaced kets may be multiplied results in D being undetermined 
to the extent of an arbitrary numerical factor of modulus unity. 

With the displacement of kets made definite in the above manner 
and the displacement of bras, of course, made equally definite, 
through their being the conjugate imaginaries of the kets, we can 
now assert that any symbolic equation between kets, bras, and 
dynamical variables must remain invariant under the displacement 
of every symbol occurring in it, on account of such an equation 
having some physical significance which will not get changed by the 
displacement. 

Take as an example the equation 


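$$\langle Q|P\rangle = c,$$

c being a number. Then we must have

$$\langle Qd|Pd\rangle = c = \langle Q|P\rangle. \tag{60}$$

From the conjugate imaginary of (59) with Q instead of P,

$$\langle Qd| = \langle Q|\bar{D}. \tag{61}$$

Hence (60) gives

$$\langle Q|\bar{D}D|P\rangle = \langle Q|P\rangle.$$

Since this holds for arbitrary <Q| and |P>, we must have

$$\bar{D}D = 1, \tag{62}$$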


giving us a general condition which D has to satisfy. 
Take as a second example the equation 


$$v|P\rangle = |R\rangle,$$

where v is any dynamical variable. Then, using $v_d$ to denote the
displaced dynamical variable, we must have

$$v_d|Pd\rangle = |Rd\rangle.$$

With the help of (59) we get

$$v_d|Pd\rangle = D|R\rangle = Dv|P\rangle = DvD^{-1}|Pd\rangle.$$

Since |Pd> can be any ket, we must have

$$v_d = DvD^{-1}, \tag{63}$$


which shows that the linear operator D determines the displacement
of dynamical variables as well as that of kets and bras. Note that
the arbitrary numerical factor of modulus unity in D does not affect
$v_d$, and also it does not affect the validity of (62).

Let us now pass to an infinitesimal displacement, i.e. taking the
displacement through the distance δx in the direction of the x-axis,
let us make δx → 0. From physical continuity we should expect
a displaced ket |Pd> to tend to the original |P> and we may further
expect the limit

$$\lim_{\delta x\to 0}\frac{|Pd\rangle-|P\rangle}{\delta x} = \lim_{\delta x\to 0}\frac{D-1}{\delta x}\,|P\rangle$$

to exist. This requires that the limit

$$\lim_{\delta x\to 0}\,(D-1)/\delta x \tag{64}$$


shall exist. This limit is a linear operator which we shall call the
displacement operator for the x-direction and denote by $d_x$. The
arbitrary numerical factor $e^{i\gamma}$ with γ real which we may multiply
into D must be made to tend to unity as δx → 0 and then introduces
an arbitrariness in $d_x$, namely, $d_x$ may be replaced by

$$\lim_{\delta x\to 0}(De^{i\gamma}-1)/\delta x = \lim_{\delta x\to 0}(D-1+i\gamma)/\delta x = d_x + ia_x,$$

where $a_x$ is the limit of γ/δx. Thus $d_x$ contains an arbitrary additive
pure imaginary number.

For δx small

$$D = 1+\delta x\,d_x. \tag{65}$$


Substituting this into (62), we get

$$(1+\delta x\,\bar{d}_x)(1+\delta x\,d_x) = 1,$$

which reduces, with neglect of δx², to

$$\delta x\,(\bar{d}_x+d_x) = 0.$$

Thus $d_x$ is a pure imaginary linear operator. Substituting (65) into
(63) we get, with neglect of δx² again,

$$v_d = (1+\delta x\,d_x)\,v\,(1-\delta x\,d_x) = v+\delta x\,(d_x v-v\,d_x), \tag{66}$$

showing that

$$\lim_{\delta x\to 0}\,(v_d-v)/\delta x = d_x v-v\,d_x. \tag{67}$$


We may describe any dynamical system in terms of the following 
dynamical variables: the Cartesian coordinates x, y, z of the centre of
mass of the system, the components $p_x$, $p_y$, $p_z$ of the total momentum
of the system, which are the canonical momenta conjugate to x, y, z
respectively, and any dynamical variables needed for describing 




internal degrees of freedom of the system. If we suppose a piece 
of apparatus which has been set up to measure x, to be displaced a
distance δx in the direction of the x-axis, it will measure x − δx, hence

$$x_d = x-\delta x.$$

Comparing this with (66) for v = x, we obtain

$$d_x x-x\,d_x = -1. \tag{68}$$

This is the quantum condition connecting $d_x$ with x. From similar
arguments we find that y, z, $p_x$, $p_y$, $p_z$ and the internal dynamical
variables, which are unaffected by the displacement, must commute with
$d_x$. Comparing these results with (9), we see that $i\hbar d_x$ satisfies just
the same quantum conditions as $p_x$. Their difference, $p_x-i\hbar d_x$,
commutes with all the dynamical variables and must therefore be a
number. This number, which is necessarily real since $p_x$ and $i\hbar d_x$ are
both real, may be made zero by a suitable choice of the arbitrary,
pure imaginary number that can be added to $d_x$. We then have the
result

$$p_x = i\hbar\,d_x, \tag{69}$$


or the x-component of the total momentum of the system is iħ times the
displacement operator $d_x$.

This is a fundamental result, which gives a new significance to
displacement operators. There is a corresponding result, of course,
also for the y and z displacement operators $d_y$ and $d_z$. The quantum
conditions which state that $p_x$, $p_y$ and $p_z$ commute with each other
are now seen to be connected with the fact that displacements in
different directions are commutable operations.
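
The relations (68) and (69) can be illustrated for a single particle in one dimension. The sketch below makes assumptions that are not in the text: it works in the Schrödinger representation, takes the displaced state to have the wave function ψ(x − δx), uses a small finite δx in place of the limit (64), and picks an arbitrary Gaussian for ψ. Under these assumptions the displacement operator acts as −d/dx, so that iħ times it is the familiar −iħ d/dx.

```python
import math

HBAR = 1.0   # units with hbar = 1 (assumption of this sketch)

def psi(x):
    """Illustrative wave function; any smooth function would serve."""
    return math.exp(-(x - 0.3) ** 2)

def psi_derivative(x):
    """Analytic derivative of psi, used only for comparison."""
    return -2.0 * (x - 0.3) * psi(x)

def displaced(f, dx):
    """Wave function of the state displaced by dx: (D f)(x) = f(x - dx)."""
    return lambda x: f(x - dx)

def d_x(f, dx=1e-5):
    """Finite-dx stand-in for the limit (64): (D - 1)/dx applied to f."""
    shifted = displaced(f, dx)
    return lambda x: (shifted(x) - f(x)) / dx

def commutator_with_x(f, x, dx=1e-5):
    """(d_x x - x d_x) f evaluated at x; by (68) this should be close to -f(x)."""
    x_times_f = lambda y: y * f(y)
    return d_x(x_times_f, dx)(x) - x * d_x(f, dx)(x)

def momentum_applied(f, x, dx=1e-5):
    """(i hbar d_x) f at x, to be compared with -i hbar f'(x) as in (69)."""
    return 1j * HBAR * d_x(f, dx)(x)

x0 = 0.7
print(commutator_with_x(psi, x0), -psi(x0))
print(momentum_applied(psi, x0), -1j * HBAR * psi_derivative(x0))
```

Each printed pair should agree to within an error of the order of the δx used.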


26. Unitary transformations 
Let U be any linear operator that has a reciprocal $U^{-1}$ and consider
the equation

$$\alpha^* = U\alpha U^{-1}, \tag{70}$$

α being an arbitrary linear operator. This equation may be regarded
as expressing a transformation from any linear operator α to a
corresponding linear operator α*, and as such it has rather remarkable
properties. In the first place it should be noted that each α* has the
same eigenvalues as the corresponding α; since, if α' is any eigenvalue
of α and |α'> is an eigenket belonging to it, we have

$$\alpha|\alpha'\rangle = \alpha'|\alpha'\rangle$$

and hence

$$\alpha^* U|\alpha'\rangle = U\alpha U^{-1}U|\alpha'\rangle = U\alpha|\alpha'\rangle = \alpha' U|\alpha'\rangle,$$

showing that U|α'> is an eigenket of α* belonging to the same eigenvalue
α', and similarly any eigenvalue of α* may be shown to be also
an eigenvalue of α. Further, if we take several α’s that are connected
by algebraic equations and transform them all according to (70), the
corresponding α*’s will be connected by the same algebraic equations.
This result follows from the fact that the fundamental algebraic processes
of addition and multiplication are left invariant by the transformation
(70), as is shown by the following equations:

$$(\alpha_1+\alpha_2)^* = U(\alpha_1+\alpha_2)U^{-1} = U\alpha_1U^{-1}+U\alpha_2U^{-1} = \alpha_1^*+\alpha_2^*,$$

$$(\alpha_1\alpha_2)^* = U\alpha_1\alpha_2U^{-1} = U\alpha_1U^{-1}U\alpha_2U^{-1} = \alpha_1^*\alpha_2^*.$$


Let us now see what condition would be imposed on U by the
requirement that any real α transforms into a real α*. Equation
(70) may be written

$$\alpha^* U = U\alpha. \tag{71}$$

Taking the conjugate complex of both sides in accordance with
(5) of § 8 we find, if α and α* are both real,

$$\bar{U}\alpha^* = \alpha\bar{U}. \tag{72}$$

Equation (71) gives us

$$\bar{U}\alpha^* U = \bar{U}U\alpha$$

and equation (72) gives us

$$\bar{U}\alpha^* U = \alpha\bar{U}U.$$

Hence

$$\bar{U}U\alpha = \alpha\bar{U}U.$$

Thus $\bar{U}U$ commutes with any real linear operator and therefore also
with any linear operator whatever, since any linear operator can be
expressed as one real one plus i times another. Hence $\bar{U}U$ is a
number. It is obviously real, its conjugate complex according to (5)
of § 8 being the same as itself, and further it must be a positive
number, since for any ket |P>, $\langle P|\bar{U}U|P\rangle$ is positive as well as
<P|P>. We can suppose it to be unity without any loss of generality
in the transformation (70). We then have

$$\bar{U}U = 1. \tag{73}$$

Equation (73) is equivalent to any of the following

$$U = \bar{U}^{-1}, \qquad \bar{U} = U^{-1}, \qquad U^{-1}\bar{U}^{-1} = 1. \tag{74}$$
A matrix or linear operator U that satisfies (73) and (74) is said
to be unitary and a transformation (70) with unitary U is called a
unitary transformation. A unitary transformation transforms real
linear operators into real linear operators and leaves invariant any
algebraic equation between linear operators. It may be considered
as applying also to kets and bras, in accordance with the equations

$$|P^*\rangle = U|P\rangle, \qquad \langle P^*| = \langle P|\bar{U} = \langle P|U^{-1}, \tag{75}$$

and then it leaves invariant any algebraic equation between linear
operators, kets, and bras. It transforms eigenvectors of α into eigenvectors
of α*. From this one can easily deduce that it transforms an
observable into an observable and that it leaves invariant any functional
relation between observables based on the general definition
of a function given in § 11.

The inverse of a unitary transformation is also a unitary trans- 
formation, since from (74), if U is unitary, U-1 is also unitary. 
Further, if two unitary transformations are applied in succession, 
the result is a third unitary transformation, as may be verified in 
the following way. Let the two unitary transformations be (70) and 

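$$\alpha^\dagger = V\alpha^* V^{-1}.$$

The connexion between α† and α is then

$$\alpha^\dagger = VU\alpha U^{-1}V^{-1} = (VU)\,\alpha\,(VU)^{-1} \tag{76}$$

from (42) of § 11. Now VU is unitary since

$$\overline{VU}\,VU = \bar{U}\bar{V}VU = \bar{U}U = 1,$$

and hence (76) is a unitary transformation.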

The transformation given in the preceding section from undisplaced 
to displaced quantities is an example of a unitary transformation, as 
is shown by equations (62), (63), corresponding to equations (73), 
(70), and equations (59), (61), corresponding to equations (75). 
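
The general properties of unitary transformations established in this section can be checked on the smallest non-trivial example. The sketch below is an illustration only, under assumptions not found in the text: operators are represented by 2×2 complex matrices, the bar operation by the conjugate transpose, and `sample_unitary` is a hypothetical two-parameter family chosen merely to supply some unitary matrices. Since the eigenvalues of a 2×2 matrix are fixed by its trace and determinant, the invariance of the eigenvalues is tested through those two quantities.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def bar(a):
    """Conjugate transpose, the matrix counterpart of the bar operation."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    """True when two 2x2 matrices agree entry by entry within tol."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

def sample_unitary(theta, phase):
    """A hypothetical two-parameter family of 2x2 unitary matrices."""
    c, s = math.cos(theta), math.sin(theta)
    e = cmath.exp(1j * phase)
    return [[complex(c), -s * e], [s * e.conjugate(), complex(c)]]

def transform(u, alpha):
    """Equation (70), using U^{-1} = bar(U) for unitary U as in (74)."""
    return matmul(matmul(u, alpha), bar(u))

def trace(a):
    return a[0][0] + a[1][1]

def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

IDENTITY = [[1 + 0j, 0j], [0j, 1 + 0j]]
U = sample_unitary(0.7, 1.3)
V = sample_unitary(-0.4, 0.2)
ALPHA = [[1 + 0j, 2 - 1j], [2 + 1j, -3 + 0j]]   # a real (Hermitian) operator
ALPHA_STAR = transform(U, ALPHA)
VU = matmul(V, U)

print("bar(U) U = 1:", close(matmul(bar(U), U), IDENTITY))
print("alpha* is real:", close(bar(ALPHA_STAR), ALPHA_STAR))
print("trace, det kept:", abs(trace(ALPHA_STAR) - trace(ALPHA)) < 1e-12,
      abs(det(ALPHA_STAR) - det(ALPHA)) < 1e-12)
print("VU is unitary:", close(matmul(bar(VU), VU), IDENTITY))
```

The last check corresponds to the statement that two unitary transformations applied in succession give a third.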

In classical mechanics one can make a transformation from the 
canonical coordinates and momenta $q_r$, $p_r$ (r = 1,..., n) to a new set of
variables $q_r^*$, $p_r^*$ (r = 1,..., n) satisfying the same P.B. relations as the
q’s and p’s, i.e. equations (8) of § 21 with q*’s and p*’s replacing the
q’s and p’s, and can express all dynamical variables in terms of the q*’s
and p*’s. The q*’s and p*’s are then also called canonical coordinates
and momenta and the transformation is called a contact transformation.
One can easily verify that the P.B. of any two dynamical
variables u and v is correctly given by formula (1) of § 21 with q*’s and
p*’s instead of q’s and p’s, so that the P.B. relationship is invariant
under a contact transformation. This results in the new canonical
coordinates and momenta being on the same footing as the original
ones for many purposes of general dynamical theory, even though the
new coordinates $q_r^*$ may not be a set of Lagrangian coordinates but
may be functions of the Lagrangian coordinates and velocities.

It will now be shown that, for a quantum dynamical system that 
has a classical analogue, unitary transformations in the quantum theory 
are the analogue of contact transformations in the classical theory. 
Unitary transformations are more general than contact transforma- 
tions, since the former can be applied to systems in quantum 
mechanics that have no classical analogue, but for those systems in 
quantum mechanics which are describable in terms of canonical 
coordinates and momenta, the analogy between the two kinds of 
transformation holds. To establish it, we note that a unitary transformation
applied to the quantum variables $q_r$, $p_r$ gives new variables
$q_r^*$, $p_r^*$ satisfying the same P.B. relations, since the P.B. relations are
equivalent to the algebraic relations (9) of § 21 and algebraic relations
are left invariant by a unitary transformation. Conversely, any real
variables $q_r^*$, $p_r^*$ satisfying the P.B. relations for canonical coordinates
and momenta are connected with the $q_r$, $p_r$ by a unitary transformation,
as is shown by the following argument.

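We use the Schrödinger representation, and write the basic ket
$|q'_1 \ldots q'_n\rangle$ as |q'> for brevity. Since we are assuming that the $q_r^*$, $p_r^*$
satisfy the P.B. relations for canonical coordinates and momenta,
we can set up a Schrödinger representation referring to them, with
the $q_r^*$ diagonal and each $p_r^*$ equal to $-i\hbar\,\partial/\partial q_r^*$. The basic kets in
this second Schrödinger representation will be $|q_1^{*\prime} \ldots q_n^{*\prime}\rangle$, which we
write |q*'> for brevity. Now introduce the linear operator U defined by

$$\langle q^{*\prime}|U|q'\rangle = \delta(q^{*\prime}-q'), \tag{77}$$

where δ(q*' − q') is short for

$$\delta(q^{*\prime}-q') = \delta(q_1^{*\prime}-q_1')\,\delta(q_2^{*\prime}-q_2')\ldots\delta(q_n^{*\prime}-q_n'). \tag{78}$$

The conjugate complex of (77) is

$$\langle q'|\bar{U}|q^{*\prime}\rangle = \delta(q^{*\prime}-q'),$$

and hence†

$$\begin{aligned}
\langle q'|\bar{U}U|q''\rangle &= \int \langle q'|\bar{U}|q^{*\prime}\rangle\,dq^{*\prime}\,\langle q^{*\prime}|U|q''\rangle\\
&= \int \delta(q^{*\prime}-q')\,dq^{*\prime}\,\delta(q^{*\prime}-q'')\\
&= \delta(q'-q''),
\end{aligned}$$

so that $\bar{U}U = 1$.

† We use the notation of a single integral sign and dq*' to denote an integral over
all the variables $q_1^{*\prime}, q_2^{*\prime}, \ldots, q_n^{*\prime}$. This abbreviation will be used also in future work.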
Thus U is a unitary operator. We have further

$$\langle q^{*\prime}|q_r^* U|q'\rangle = q_r^{*\prime}\,\delta(q^{*\prime}-q')$$

and

$$\langle q^{*\prime}|Uq_r|q'\rangle = \delta(q^{*\prime}-q')\,q_r'.$$

The right-hand sides of these two equations are equal on account of
the property of the δ function (11) of § 15, and hence

$$q_r^* U = Uq_r$$

or

$$q_r^* = Uq_rU^{-1}.$$

Again, from (45) and (46),

$$\langle q^{*\prime}|p_r^* U|q'\rangle = -i\hbar\,\frac{\partial}{\partial q_r^{*\prime}}\,\delta(q^{*\prime}-q'),$$

$$\langle q^{*\prime}|Up_r|q'\rangle = i\hbar\,\frac{\partial}{\partial q_r'}\,\delta(q^{*\prime}-q').$$

The right-hand sides of these two equations are obviously equal, and
hence

$$p_r^* U = Up_r$$

or

$$p_r^* = Up_rU^{-1}.$$

Thus all the conditions for a unitary transformation are verified.


We get an infinitesimal unitary transformation by taking U in (70) 
to differ by an infinitesimal from unity. Put 


$$U = 1+i\epsilon F,$$

where ε is infinitesimal, so that its square can be neglected. Then

$$U^{-1} = 1-i\epsilon F.$$

The unitary condition (73) or (74) requires that F shall be real. The
transformation equation (70) now takes the form

$$\alpha^* = (1+i\epsilon F)\,\alpha\,(1-i\epsilon F),$$

which gives

$$\alpha^*-\alpha = i\epsilon\,(F\alpha-\alpha F). \tag{79}$$

It may be written in P.B. notation

$$\alpha^*-\alpha = \epsilon\hbar\,[\alpha, F]. \tag{80}$$

If α is a canonical coordinate or momentum, this is formally the same
as a classical infinitesimal contact transformation.
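
The neglect of ε² in passing to (79) can be made concrete with a small numerical experiment. In the sketch below, which is an illustration under assumptions not in the text, F and α are particular 2×2 matrices chosen arbitrarily (F real in the sense of being Hermitian), U = 1 + iεF is inverted exactly, and the exact change α* − α is compared with the first-order expression iε(Fα − αF). The discrepancy should fall off as ε².

```python
def matmul(a, b):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def combine(a, b, scale):
    """Entry-wise a + scale * b."""
    return [[a[i][j] + scale * b[i][j] for j in range(2)] for i in range(2)]

def inverse(a):
    """Exact inverse of a 2x2 matrix."""
    d = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / d, -a[0][1] / d], [-a[1][0] / d, a[0][0] / d]]

IDENTITY = [[1 + 0j, 0j], [0j, 1 + 0j]]
F = [[1 + 0j, 0.5j], [-0.5j, -1 + 0j]]     # illustrative real (Hermitian) F
ALPHA = [[0j, 1 + 0j], [1 + 0j, 0j]]       # illustrative operator alpha

def exact_change(eps):
    """alpha* - alpha with alpha* = U alpha U^{-1} and U = 1 + i eps F."""
    u = combine(IDENTITY, F, 1j * eps)
    alpha_star = matmul(matmul(u, ALPHA), inverse(u))
    return combine(alpha_star, ALPHA, -1)

def first_order_change(eps):
    """Right-hand side of (79): i eps (F alpha - alpha F)."""
    commutator = combine(matmul(F, ALPHA), matmul(ALPHA, F), -1)
    return [[1j * eps * commutator[i][j] for j in range(2)] for i in range(2)]

def max_difference(eps):
    """Largest entry of the discrepancy between exact and first-order change."""
    exact, approx = exact_change(eps), first_order_change(eps)
    return max(abs(exact[i][j] - approx[i][j]) for i in range(2) for j in range(2))

for eps in (1e-2, 1e-3, 1e-4):
    print(eps, max_difference(eps))
```

Reducing ε by a factor of ten should reduce the printed discrepancy by about a factor of a hundred.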


V
THE EQUATIONS OF MOTION


27. Schrödinger’s form for the equations of motion

Our work from § 5 onwards has all been concerned with one instant 
of time. It gave the general scheme of relations between states and 
dynamical variables for a dynamical system at one instant of time. 
To get a complete theory of dynamics we must consider also the 
connexion between different instants of time. When one makes an 
observation on the dynamical system, the state of the system gets 
changed in an unpredictable way, but in between observations 
causality applies, in quantum mechanics as in classical mechanics, 
and the system is governed by equations of motion which make the 
state at one time determine the state at a later time. These equations 
of motion we now proceed to study. They will apply so long as the 
dynamical system is left undisturbed by any observation or similar
process.† Their general form can be deduced from the principle of
superposition of Chapter I.

Let us consider a particular state of motion throughout the time
during which the system is left undisturbed. We shall have the state
at any time t corresponding to a certain ket which depends on t and
which may be written |t>. If we deal with several of these states of
motion we distinguish them by giving them labels such as A, and we
then write the ket which corresponds to the state at time t for one
of them |At>. The requirement that the state at one time determines
the state at another time means that $|At_0\rangle$ determines |At> except
for a numerical factor. The principle of superposition applies to these
states of motion throughout the time during which the system is
undisturbed, and means that if we take a superposition relation
holding for certain states at time $t_0$ and giving rise to a linear equation
between the corresponding kets, e.g. the equation

$$|Rt_0\rangle = c_1|At_0\rangle + c_2|Bt_0\rangle,$$

the same superposition relation must hold between the states of
motion throughout the time during which the system is undisturbed
and must lead to the same equation between the kets corresponding
to these states at any time t (in the undisturbed time interval), i.e.
the equation

$$|Rt\rangle = c_1|At\rangle + c_2|Bt\rangle,$$

provided the arbitrary numerical factors by which these kets may be
multiplied are suitably chosen. It follows that the |Pt>’s are linear
functions of the $|Pt_0\rangle$’s and each |Pt> is the result of some linear
operator applied to $|Pt_0\rangle$. In symbols

$$|Pt\rangle = T|Pt_0\rangle, \tag{1}$$

where T is a linear operator independent of P and depending only
on t (and $t_0$).

† The preparation of a state is a process of this kind. It often takes the form of
making an observation and selecting the system when the result of the observation
turns out to be a certain pre-assigned number.

We now assume that each |Pt> has the same length as the corresponding
$|Pt_0\rangle$. It is not necessarily possible to choose the arbitrary
numerical factors by which the |Pt>’s may be multiplied so as to
make this so without destroying the linear dependence of the |Pt>’s
on the $|Pt_0\rangle$’s, so the new assumption is a physical one and not just
a question of notation. It involves a kind of sharpening of the
principle of superposition. The arbitrariness in |Pt> now becomes
merely a phase factor, which must be independent of P in order that
the linear dependence of the |Pt>’s on the $|Pt_0\rangle$’s may be preserved.
From the condition that the length of $c_1|Pt\rangle + c_2|Qt\rangle$ equals that of
$c_1|Pt_0\rangle + c_2|Qt_0\rangle$ for any complex numbers $c_1$, $c_2$, we can deduce that

$$\langle Qt|Pt\rangle = \langle Qt_0|Pt_0\rangle. \tag{2}$$

The connexion between the |Pt>’s and $|Pt_0\rangle$’s is formally similar
to the connexion we had in § 25 between the displaced and undisplaced
kets, with a process of time displacement instead of the space displacement
of § 25. Equations (1) and (2) play the part of equations (59)
and (60) of § 25. We can develop the consequences of these equations
as in § 25 and can deduce that T contains an arbitrary numerical
factor of modulus unity and satisfies

$$\bar{T}T = 1, \tag{3}$$

corresponding to (62) of § 25, so T is unitary. We pass to the infinitesimal
case by making $t \to t_0$ and assume from physical continuity that
the limit

$$\lim_{t\to t_0}\frac{|Pt\rangle-|Pt_0\rangle}{t-t_0}$$

exists. This limit is just the derivative of $|Pt_0\rangle$ with respect to $t_0$.
From (1) it equals

$$\frac{d|Pt_0\rangle}{dt_0} = \left\{\lim_{t\to t_0}\frac{T-1}{t-t_0}\right\}|Pt_0\rangle. \tag{4}$$
The limit operator occurring here is, like (64) of § 25, a pure imaginary
linear operator and is undetermined to the extent of an arbitrary
additive pure imaginary number. Putting this limit operator multiplied
by iħ equal to H, or rather $H(t_0)$ since it may depend on $t_0$,
equation (4) becomes, when written for a general t,

$$i\hbar\,\frac{d|Pt\rangle}{dt} = H(t)|Pt\rangle. \tag{5}$$


Equation (5) gives the general law for the variation with time of 
the ket corresponding to the state at any time. It is Schrödinger’s
form for the equations of motion. It involves just one real linear
operator H(t), which must be characteristic of the dynamical system
under consideration. We assume that H(t) is the total energy of
the system. There are two justifications for this assumption, (i) the
analogy with classical mechanics, which will be developed in the
next section, and (ii) we have H(t) appearing as iħ times an operator
of displacement in time similar to the operators of displacement in 
the x, y, and z directions of § 25, so corresponding to (69) of § 25 
we should have H(t) equal to the total energy, since the theory of 
relativity puts energy in the same relation to time as momentum to 
distance. 

We assume on physical grounds that the total energy of a system 
is always an observable. For an isolated system it is a constant, and 
may then be written H. Even when it is not a constant we shall often 
write it simply H, leaving its dependence on t understood. If the
energy depends on t, it means the system is acted on by external
forces. An action of this kind is to be distinguished from a distur- 
bance caused by a process of observation, as the former is compatible 
with causality and equations of motion while the latter is not. 

We can get a connexion between H(t) and the T of equation (1)
by substituting for |Pt> in (5) its value given by equation (1). This
gives

$$i\hbar\,\frac{dT}{dt}\,|Pt_0\rangle = H(t)\,T|Pt_0\rangle.$$

Since $|Pt_0\rangle$ may be any ket, we have

$$i\hbar\,\frac{dT}{dt} = H(t)\,T. \tag{6}$$
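
Equations (3) and (6) can be checked together on a simple model. The sketch below is an illustration under assumptions that are not in the text: a system with only two independent states, a time-independent energy H = ħω σ_x (σ_x being the 2×2 matrix with unit off-diagonal elements), t₀ = 0, and units with ħ = 1. Because σ_x² = 1, the operator T(t) = cos ωt − i sin ωt σ_x is the solution of (6) for this H that reduces to unity at t = 0; the code verifies the unitary condition directly and equation (6) by a central difference.

```python
import math

HBAR = 1.0     # units with hbar = 1 (assumption of this sketch)
OMEGA = 2.0    # illustrative angular frequency

def matmul(a, b):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def bar(a):
    """Conjugate transpose, the matrix counterpart of the bar operation."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def max_abs_difference(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

IDENTITY = [[1 + 0j, 0j], [0j, 1 + 0j]]
HAMILTONIAN = [[0j, HBAR * OMEGA + 0j], [HBAR * OMEGA + 0j, 0j]]

def evolution_operator(t):
    """T(t) = cos(omega t) - i sin(omega t) sigma_x, with t0 = 0."""
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    return [[complex(c), -1j * s], [-1j * s, complex(c)]]

def unitarity_defect(t):
    """Largest entry of bar(T) T - 1; equation (3) says this vanishes."""
    big_t = evolution_operator(t)
    return max_abs_difference(matmul(bar(big_t), big_t), IDENTITY)

def equation_6_residual(t, dt=1e-5):
    """Largest entry of i hbar dT/dt - H T, with dT/dt a central difference."""
    ahead, behind = evolution_operator(t + dt), evolution_operator(t - dt)
    lhs = [[1j * HBAR * (ahead[i][j] - behind[i][j]) / (2.0 * dt)
            for j in range(2)] for i in range(2)]
    rhs = matmul(HAMILTONIAN, evolution_operator(t))
    return max_abs_difference(lhs, rhs)

for t in (0.0, 0.4, 1.7):
    print(t, unitarity_defect(t), equation_6_residual(t))
```

Both printed quantities should be zero to within rounding and discretisation error.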


Equation (5) is very important for practical problems, where it is
usually used in conjunction with a representation. Introducing a
representation with a complete set of commuting observables ξ
diagonal and putting <ξ'|Pt> equal to ψ(ξ't), we have, passing to the
standard ket notation,

$$|Pt\rangle = \psi(\xi t)\rangle.$$

Equation (5) now becomes

$$i\hbar\,\frac{\partial}{\partial t}\,\psi(\xi t)\rangle = H\psi(\xi t)\rangle. \tag{7}$$

Equation (7) is known as Schrödinger’s wave equation and its solutions
ψ(ξt) are time-dependent wave functions. Each solution corresponds to
a state of motion of the system and the square of its modulus gives
the probability of the ξ’s having specified values at any time t. For
a system describable in terms of canonical coordinates and momenta
we may use Schrödinger’s representation and can then take H to be
an operator of differentiation in accordance with (42) of § 22.


28. Heisenberg’s form for the equations of motion 

In the preceding section we set up a picture of the states of 
undisturbed motion by making each of them correspond to a moving 
ket, the state at any time corresponding to the ket at that time. We 
shall call this the Schrödinger picture. Let us apply to our kets the
unitary transformation which makes each ket |a> go over into

$$|a^*\rangle = T^{-1}|a\rangle. \tag{8}$$

This transformation is of the form given by (75) of § 26 with $T^{-1}$ for
U, but it depends on the time t since T depends on t. It is thus to be
pictured as the application of a continuous motion (consisting of
rotations and uniform deformations) to the whole ket vector space.
A ket which is originally fixed becomes a moving one, its motion being
given by (8) with |a> independent of t. On the other hand, a ket
which is originally moving to correspond to a state of undisturbed
motion, i.e. in accordance with equation (1), becomes fixed, since on
substituting |Pt> for |a> in (8) we get |a*> independent of t. Thus
the transformation brings the kets corresponding to states of undisturbed 
motion to rest. 

The unitary transformation must be applied also to bras and linear 
operators, in order that equations between the various quantities may 
remain invariant. The transformation applied to bras is given by the 
conjugate imaginary of (8) and applied to linear operators it is given 
by (70) of § 26 with $T^{-1}$ for U, i.e.

$$\alpha^* = T^{-1}\alpha T. \tag{9}$$

A linear operator which is originally fixed transforms into a moving
linear operator in general. Now a dynamical variable corresponds to 
a linear operator which is originally fixed (because it does not refer 
to t at all), so after the transformation it corresponds to a moving 
linear operator. The transformation thus leads us to a new picture 
of the motion, in which the states correspond to fixed vectors and 
the dynamical variables to moving linear operators. We shall call 
this the Heisenberg picture. 

The physical condition of the dynamical system at any time 
involves the relation of the dynamical variables to the state, and 
the change of the physical condition with time may be ascribed 
either to a change in the state, with the dynamical variables kept 
fixed, which gives us the Schrödinger picture, or to a change in the
dynamical variables, with the state kept fixed, which gives us the
Heisenberg picture.

In the Heisenberg picture there are equations of motion for the 
dynamical variables. Take a dynamical variable corresponding to 
the fixed linear operator v in the Schrödinger picture. In the Heisenberg
picture it corresponds to a moving linear operator, which we
write as $v_t$ instead of $v^*$, to bring out its dependence on t, and which
is given by

$$v_t = T^{-1} v T \tag{10}$$

or

$$T v_t = v T.$$

Differentiating with respect to t, we get

$$\frac{dT}{dt}\,v_t + T\,\frac{dv_t}{dt} = v\,\frac{dT}{dt}.$$

With the help of (6), this gives

$$H T v_t + i\hbar\,T\,\frac{dv_t}{dt} = v H T$$

or

$$i\hbar\,\frac{dv_t}{dt} = T^{-1} v H T - T^{-1} H T v_t = v_t H_t - H_t v_t, \tag{11}$$

where

$$H_t = T^{-1} H T. \tag{12}$$

Equation (11) may be written in P.B. notation

$$\frac{dv_t}{dt} = [v_t, H_t]. \tag{13}$$

Equation (11) or (13) shows how any dynamical variable varies
with time in the Heisenberg picture and gives us Heisenberg’s form 
for the equations of motion. These equations of motion are determined 
by the one linear operator $H_t$, which is just the transform of the linear
operator H occurring in Schrödinger’s form for the equations of
motion and corresponds to the energy in the Heisenberg picture. We
shall call the dynamical variables in the Heisenberg picture, where
they vary with the time, Heisenberg dynamical variables, to distinguish
them from the fixed dynamical variables of the Schrödinger picture,
which we shall call Schrödinger dynamical variables. Each Heisenberg
dynamical variable is connected with the corresponding Schrödinger
dynamical variable by equation (10). Since this connexion is a unitary
transformation, all algebraic and functional relationships are the
same for both kinds of dynamical variable. We have T = 1 for
$t = t_0$, so that $v_{t_0} = v$ and any Heisenberg dynamical variable at time
$t_0$ equals the corresponding Schrödinger dynamical variable.

Equation (13) can be compared with classical mechanics, where we 
also have dynamical variables varying with the time. The equations 
of motion of classical mechanics can be written in the Hamiltonian 


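form

$$\frac{dq_r}{dt} = \frac{\partial H}{\partial p_r}, \qquad \frac{dp_r}{dt} = -\frac{\partial H}{\partial q_r}, \tag{14}$$

where the q’s and p’s are a set of canonical coordinates and momenta
and H is the energy expressed as a function of them and possibly also
of t. The energy expressed in this way is called the Hamiltonian.
Equations (14) give, for v any function of the q’s and p’s that does
not contain the time t explicitly,

$$\frac{dv}{dt} = \sum_r \left\{\frac{\partial v}{\partial q_r}\frac{dq_r}{dt} + \frac{\partial v}{\partial p_r}\frac{dp_r}{dt}\right\} = \sum_r \left\{\frac{\partial v}{\partial q_r}\frac{\partial H}{\partial p_r} - \frac{\partial v}{\partial p_r}\frac{\partial H}{\partial q_r}\right\} = [v, H], \tag{15}$$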


with the classical definition of a P.B., equation (1) of § 21. This is 
of the same form as equation (13) in the quantum theory. We thus 
get an analogy between the classical equations of motion in the 
Hamiltonian form and the quantum equations of motion in Heisenberg’s
form. This analogy provides a justification for the assumption
that the linear operator H introduced in the preceding section is the
energy of the system in quantum mechanics. 
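The classical relation (15) can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes a single degree of freedom, a harmonic-oscillator Hamiltonian, and the observable v = qp, all chosen purely for illustration, and it evaluates the partial derivatives by central differences.

```python
# Sketch: verify dv/dt = [v, H] (classical P.B.) at one phase-space point.
# Assumed example (not from the text): H = p^2/2m + k q^2/2 and v = q p.

def hamiltonian(q, p, m=1.0, k=4.0):
    """Energy of the assumed harmonic oscillator."""
    return p * p / (2.0 * m) + 0.5 * k * q * q


def observable(q, p):
    """An arbitrary function of q and p with no explicit time dependence."""
    return q * p


def partial(f, q, p, wrt, h=1e-6):
    """Central-difference partial derivative of f(q, p) with respect to 'q' or 'p'."""
    if wrt == "q":
        return (f(q + h, p) - f(q - h, p)) / (2.0 * h)
    return (f(q, p + h) - f(q, p - h)) / (2.0 * h)


def poisson_bracket(u, v, q, p):
    """Classical P.B. [u, v] = du/dq dv/dp - du/dp dv/dq."""
    return (partial(u, q, p, "q") * partial(v, q, p, "p")
            - partial(u, q, p, "p") * partial(v, q, p, "q"))


def time_derivative(v, q, p):
    """dv/dt from the chain rule, with dq/dt and dp/dt taken from equations (14)."""
    dq_dt = partial(hamiltonian, q, p, "p")
    dp_dt = -partial(hamiltonian, q, p, "q")
    return partial(v, q, p, "q") * dq_dt + partial(v, q, p, "p") * dp_dt


q0, p0 = 0.3, -1.2
lhs = time_derivative(observable, q0, p0)
rhs = poisson_bracket(observable, hamiltonian, q0, p0)
print(lhs, rhs)  # the two numbers should agree to rounding error
```

For this choice the exact value is p²/m − kq², so both numbers can also be compared with the closed form.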

In classical mechanics a dynamical system is defined mathemati- 
cally when the Hamiltonian is given, i.e. when the energy is given 
in terms of a set of canonical coordinates and momenta, as this is 
sufficient to fix the equations of motion. In quantum mechanics a 
dynamical system is defined mathematically when the energy is 
given in terms of dynamical variables whose commutation relations 
are known, as this is then sufficient to fix the equations of motion, 
in both Schrödinger’s and Heisenberg’s form. We need to have
either H expressed in terms of the Schrödinger dynamical variables
or $H_t$ expressed in terms of the corresponding Heisenberg dynamical
variables, the functional relationship being, of course, the same in 
both cases. We call the energy expressed in this way the Hamiltonian 
of the dynamical system in quantum mechanics, to keep up the 
analogy with the classical theory. 

A system in quantum mechanics always has a Hamiltonian, whether 
the system is one that has a classical analogue and is describable in 
terms of canonical coordinates and momenta or not. However, if the 
system does have a classical analogue, its connexion with classical 
mechanics is specially close and one can usually assume that the
Hamiltonian is the same function of the canonical coordinates and
momenta in the quantum theory as in the classical theory.† There
would be a difficulty in this, of course, if the classical Hamiltonian 
involved a product of factors whose quantum analogues do not com- 
mute, as one would not know in which order to put these factors in 
the quantum Hamiltonian, but this does not happen for most of the 
elementary dynamical systems whose study is important for atomic 
physics. In consequence we are able also largely to use the same 
language for describing dynamical systems in the quantum theory as 
in the classical theory (e.g. to talk about particles with given masses 
moving through given fields of force), and when given a system in 
classical mechanics, can usually give a meaning to ‘the same’ system 
in quantum mechanics.

† This assumption is found in practice to be successful only when applied with the
dynamical coordinates and momenta referring to a Cartesian system of axes and not
to more general curvilinear coordinates.

Equation (13) holds for $v_t$ any function of the Heisenberg dynamical
variables not involving the time explicitly, i.e. for v any constant
linear operator in the Schrödinger picture. It shows that such a
function $v_t$ is constant if it commutes with $H_t$, or if v commutes with H.
We then have

$$v_t = v_{t_0} = v,$$

and we call $v_t$ or v a constant of the motion. It is necessary that v shall
commute with H at all times, which is usually possible only if H is
constant. In this case we can substitute H for v in (13) and deduce
that $H_t$ is constant, showing that H itself is then a constant of the
motion. Thus if the Hamiltonian is constant in the Schrödinger
picture, it is also constant in the Heisenberg picture.

For an isolated system, a system not acted on by any external 
forces, there are always certain constants of the motion. One of these 
is the total energy or Hamiltonian. Others are provided by the 
displacement theory of § 25. It is evident physically that the total 
energy must remain unchanged if all the dynamical variables are 
displaced in a certain way, so equation (63) of § 25 must hold with
$v_d = v = H$. Thus D commutes with H and is a constant of the
motion. Passing to the case of an infinitesimal displacement, we see
that the displacement operators $d_x$, $d_y$, and $d_z$ are constants of the
motion and hence, from (69) of § 25, the total momentum is a constant
of the motion. Again, the total energy must remain unchanged if all 
the dynamical variables are subjected to a certain rotation. This 
leads, as will be shown in § 35, to the result that the total angular 
momentum is a constant of the motion. The laws of conservation of 
energy, momentum, and angular momentum hold for an isolated system 
in the Heisenberg picture in quantum mechanics, as they hold in 
classical mechanics. 

Two forms for the equations of motion of quantum mechanics have 
now been given. Of these, the Schrödinger form is the more useful
one for practical problems, as it provides the simpler equations. The
unknowns in Schrödinger’s wave equation are the numbers which
form the representative of a ket vector, while Heisenberg’s equation
of motion for a dynamical variable, if expressed in terms of a representation,
would involve as unknowns the numbers forming the
representative of the dynamical variable. The latter are far more
numerous and therefore more difficult to evaluate than the Schrödinger
unknowns. Heisenberg’s form for the equations of motion is
of value in providing an immediate analogy with classical mechanics
and enabling one to see how various features of classical theory, such
as the conservation laws referred to above, are translated into quantum
theory.


29. Stationary states 
We shall here deal with a dynamical system whose energy is constant.
Certain specially simple relations hold for this case. Equation
(6) can be integrated† to give

$$T = e^{-iH(t-t_0)/\hbar},$$

with the help of the initial condition that T = 1 for $t = t_0$. This
result substituted into (1) gives

$$|Pt\rangle = e^{-iH(t-t_0)/\hbar}|Pt_0\rangle, \tag{16}$$

which is the integral of Schrödinger’s equation of motion (5), and
substituted into (10) it gives

$$v_t = e^{iH(t-t_0)/\hbar}\,v\,e^{-iH(t-t_0)/\hbar}, \tag{17}$$

which is the integral of Heisenberg’s equation of motion (11), $H_t$ being
now equal to H. Thus we have solutions of the equations of motion
in a simple form. However, these solutions are not of much practical
value, because of the difficulty involved in evaluating the operator
$e^{-iH(t-t_0)/\hbar}$, unless H is particularly simple, and for practical purposes
one usually has to fall back on Schrödinger’s wave equation.
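A case in which H is particularly simple may make the two pictures concrete. The sketch below is an illustration under stated assumptions and is not taken from the text: a system with only two states, units in which ħ = 1, a Hamiltonian already diagonal with assumed eigenvalues E1 and E2, and an assumed fixed operator v. It compares the average of v computed with the moving ket of (16) against the average computed with the moving operator of (17), and checks the equation of motion (11) by a finite difference.

```python
import cmath

HBAR = 1.0          # assumption: units in which hbar = 1
E1, E2 = 0.5, 2.0   # assumed energy eigenvalues, H = diag(E1, E2)

H = [[E1 + 0j, 0j], [0j, E2 + 0j]]
v = [[0j, 1 + 0j], [1 + 0j, 0j]]       # assumed fixed (Schrodinger) operator
ket0 = [complex(0.6), complex(0.8)]    # normalized ket at time t0 = 0


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def apply(a, ket):
    """Matrix acting on a ket."""
    return [sum(a[i][k] * ket[k] for k in range(2)) for i in range(2)]


def expectation(ket, a):
    """<ket| a |ket>."""
    out = apply(a, ket)
    return sum(ket[i].conjugate() * out[i] for i in range(2))


def T(t, t0=0.0):
    """T = exp(-iH(t - t0)/hbar), trivial to write down because H is diagonal."""
    return [[cmath.exp(-1j * E1 * (t - t0) / HBAR), 0j],
            [0j, cmath.exp(-1j * E2 * (t - t0) / HBAR)]]


def v_heisenberg(t):
    """v_t = T^{-1} v T, using T^{-1} = T^dagger for unitary T."""
    Tt = T(t)
    return matmul(dagger(Tt), matmul(v, Tt))


t = 1.3
schrodinger_value = expectation(apply(T(t), ket0), v)   # moving ket, fixed v
heisenberg_value = expectation(ket0, v_heisenberg(t))   # fixed ket, moving v_t

# Finite-difference check of (11): i hbar dv_t/dt = v_t H - H v_t (here H_t = H).
dt = 1e-6
vp, vm, vt = v_heisenberg(t + dt), v_heisenberg(t - dt), v_heisenberg(t)
vH, Hv = matmul(vt, H), matmul(H, vt)
residual = max(
    abs(1j * HBAR * (vp[i][j] - vm[i][j]) / (2 * dt) - (vH[i][j] - Hv[i][j]))
    for i in range(2) for j in range(2)
)
print(schrodinger_value, heisenberg_value, residual)
```

The two averages agree because the pictures differ only by a unitary transformation; the residual is limited by the finite-difference step.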

Let us consider a state of motion such that at time $t_0$ it is an eigenstate
of the energy. The ket $|Pt_0\rangle$ corresponding to it at this time
must be an eigenket of H. If $H'$ is the eigenvalue to which it belongs,
equation (16) gives

$$|Pt\rangle = e^{-iH'(t-t_0)/\hbar}|Pt_0\rangle,$$

showing that $|Pt\rangle$ differs from $|Pt_0\rangle$ only by a phase factor. Thus
the state always remains an eigenstate of the energy, and further, it
does not vary with the time at all, since the direction of the ket $|Pt\rangle$
does not vary with the time. Such a state is called a stationary state.
The probability for any particular result of an observation on it is 
independent of the time when the observation is made. From our 
assumption that the energy is an observable, there are sufficient 
stationary states for an arbitrary state to be dependent on them. 

The time-dependent wave function $\psi(\xi t)$ representing a stationary
state of energy $H'$ will vary with time according to the law

$$\psi(\xi t) = \psi_0(\xi)\,e^{-iH't/\hbar}, \tag{18}$$

and Schrödinger’s wave equation (7) for it reduces to

$$H'\psi_0\rangle = H\psi_0\rangle. \tag{19}$$

This equation merely asserts that the state represented by $\psi_0$ is an
eigenstate of H. We call a function $\psi_0$ satisfying (19) an eigenfunction
of H, belonging to the eigenvalue $H'$.

† The integration can be carried out as though H were an ordinary algebraic
variable instead of a linear operator, because there is no quantity that does not
commute with H in the work.

In the Heisenberg picture the stationary states correspond to fixed
eigenvectors of the energy. We can set up a representation in which
all the basic vectors are eigenvectors of the energy and so correspond 
to stationary states in the Heisenberg picture. We call such a repre- 
sentation a Heisenberg representation. The first form of quantum 
mechanics, discovered by Heisenberg in 1925, was in terms of a 
representation of this kind. The energy is diagonal in the representa- 
tion. Any other diagonal dynamical variable must commute with the 
energy and is therefore a constant of the motion. The problem of 
setting up a Heisenberg representation thus reduces to the problem 
of finding a complete set of commuting observables, each of which 
is a constant of the motion, and then making these observables 
diagonal. The energy must be a function of these observables, from 
Theorem 2 of § 19. It is sometimes convenient to take the energy 
itself as one of them. 

Let $\alpha$ denote the complete set of commuting observables in a
Heisenberg representation, so that the basic vectors are written $\langle\alpha'|$,
$|\alpha''\rangle$. The energy is a function of these observables $\alpha$, say $H = H(\alpha)$.
From (17) we get

$$\langle\alpha'|v_t|\alpha''\rangle = \langle\alpha'|e^{iH(t-t_0)/\hbar}\,v\,e^{-iH(t-t_0)/\hbar}|\alpha''\rangle = e^{i(H'-H'')(t-t_0)/\hbar}\,\langle\alpha'|v|\alpha''\rangle, \tag{20}$$

where $H' = H(\alpha')$ and $H'' = H(\alpha'')$. The factor $\langle\alpha'|v|\alpha''\rangle$ on the right-hand
side here is independent of t, being an element of the matrix
representing the fixed linear operator v. Formula (20) shows how the
Heisenberg matrix elements of any Heisenberg dynamical variable
vary with time, and it makes $v_t$ satisfy the equation of motion (11),
as is easily verified. The variation given by (20) is simply periodic
with the frequency

$$|H'-H''|/2\pi\hbar = |H'-H''|/h, \tag{21}$$

depending only on the energy difference of the two stationary states
to which the matrix element refers. This result is closely connected
with the Combination Law of Spectroscopy and Bohr’s Frequency
Condition, according to which (21) is the frequency of the electromagnetic
radiation emitted or absorbed when the system makes a
transition under the influence of radiation between the stationary
states $\alpha'$ and $\alpha''$, the eigenvalues of H being Bohr’s energy levels.
These matters will be dealt with in § 45.
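As a numerical illustration of the frequency (21), the sketch below evaluates |H′ − H″|/h for two assumed energy levels. The levels used, those of the hydrogen atom in the form −13.6057 eV/n², are an assumption brought in for the example and are not derived at this point of the text; the value of h is the standard one in electron-volt seconds.

```python
# Sketch: Bohr frequency |H' - H''| / h for two assumed stationary states.

PLANCK_H_EV_S = 4.135667696e-15  # Planck's constant h in eV s
RYDBERG_EV = 13.605693           # assumed hydrogen ground-state binding energy, eV


def hydrogen_level(n):
    """Assumed energy level E_n = -13.6057 eV / n^2 (illustration only)."""
    return -RYDBERG_EV / (n * n)


def bohr_frequency(energy_a, energy_b):
    """Frequency (21) of the matrix element connecting two stationary states."""
    return abs(energy_a - energy_b) / PLANCK_H_EV_S


nu = bohr_frequency(hydrogen_level(2), hydrogen_level(1))
print(f"{nu:.4e} Hz")
```

The frequency depends only on the difference of the two levels, so adding the same constant to both leaves it unchanged.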


30. The free particle 

The most fundamental and elementary application of quantum 
mechanics is to the system consisting merely of a free particle, or 
particle not acted on by any forces. For dealing with it we use as 
dynamical variables the three Cartesian coordinates x, y, z and their 
conjugate momenta $p_x$, $p_y$, $p_z$. The Hamiltonian is equal to the
kinetic energy of the particle, namely

$$H = \frac{1}{2m}\,(p_x^2 + p_y^2 + p_z^2) \tag{22}$$

according to Newtonian mechanics, m being the mass. This formula
is valid only if the velocity of the particle is small compared with c,
the velocity of light. For a rapidly moving particle, such as we often
have to deal with in atomic theory, (22) must be replaced by the
relativistic formula

$$H = c\,(m^2c^2 + p_x^2 + p_y^2 + p_z^2)^{1/2}. \tag{23}$$

For small values of $p_x$, $p_y$, and $p_z$ (23) goes over into (22), except for
the constant term $mc^2$ which corresponds to the rest-energy of the
particle in the theory of relativity and which has no influence on the
equations of motion. Formulas (22) and (23) can be taken over
directly into the quantum theory, the square root in (23) being now
understood as the positive square root defined at the end of § 11.
The constant term $mc^2$ by which (23) differs from (22) for small values
of $p_x$, $p_y$, and $p_z$ can still have no physical effects, since the Hamiltonian
in the quantum theory, as introduced in § 27, is undefined to
the extent of an arbitrary additive real constant.

We shall here work with the more accurate formula (23). We shall
first solve the Heisenberg equations of motion. From the quantum
conditions (9) of § 21, $p_x$ commutes with $p_y$ and $p_z$, and hence, from
Theorem 1 of § 19 extended to a set of commuting observables, $p_x$
commutes with any function of $p_x$, $p_y$, and $p_z$ and therefore with H.
It follows that $p_x$ is a constant of the motion. Similarly $p_y$ and $p_z$ are
constants of the motion. These results are the same as in the classical
theory. Again, the equation of motion for a coordinate, $x_t$ say, is,
according to (11),

$$i\hbar\,\frac{dx_t}{dt} = i\hbar\,\dot{x}_t = x_t\,c\,(m^2c^2 + p_x^2 + p_y^2 + p_z^2)^{1/2} - c\,(m^2c^2 + p_x^2 + p_y^2 + p_z^2)^{1/2}\,x_t.$$

The right-hand side here can be evaluated by means of formula
(31) of § 22 with the roles of coordinates and momenta interchanged,
so that it reads

$$q_r f - f q_r = i\hbar\,\partial f/\partial p_r, \tag{24}$$

f now being any function of the p’s. This gives

$$\dot{x}_t = \frac{\partial}{\partial p_x}\,c\,(m^2c^2 + p_x^2 + p_y^2 + p_z^2)^{1/2} = \frac{c^2 p_x}{H}.$$

Similarly,

$$\dot{y}_t = \frac{c^2 p_y}{H}, \qquad \dot{z}_t = \frac{c^2 p_z}{H}. \tag{25}$$

The magnitude of the velocity is

$$v = (\dot{x}_t^2 + \dot{y}_t^2 + \dot{z}_t^2)^{1/2} = c^2\,(p_x^2 + p_y^2 + p_z^2)^{1/2}/H. \tag{26}$$

Equations (25) and (26) are just the same as in the classical theory.
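The two limits of formulas (25) and (26) can be seen numerically. The sketch below is only an illustration: it treats the momenta as ordinary numbers, takes the particle to be an electron (an assumption made for definiteness), and evaluates the speed for one momentum small compared with mc and one large compared with mc.

```python
import math

C = 299_792_458.0        # velocity of light, m/s
M_E = 9.1093837015e-31   # electron mass in kg (assumed particle)


def energy(px, py, pz, m):
    """Relativistic Hamiltonian (23): H = c (m^2 c^2 + p^2)^(1/2)."""
    return C * math.sqrt(m * m * C * C + px * px + py * py + pz * pz)


def velocity(px, py, pz, m):
    """Velocity components (25): c^2 p / H."""
    h = energy(px, py, pz, m)
    return tuple(C * C * p / h for p in (px, py, pz))


def speed(px, py, pz, m):
    """Magnitude of the velocity, formula (26)."""
    return math.sqrt(sum(component ** 2 for component in velocity(px, py, pz, m)))


slow = speed(1.0e-27, 0.0, 0.0, M_E)  # p << mc: close to the Newtonian p/m
fast = speed(1.0e-18, 0.0, 0.0, M_E)  # p >> mc: close to, but below, c
print(slow, 1.0e-27 / M_E)
print(fast / C)
```

For the small momentum the speed agrees with p/m from (22) to high accuracy, while for the large momentum it approaches c from below.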
Let us consider a state that is an eigenstate of the momenta,
belonging to the eigenvalues $p'_x$, $p'_y$, $p'_z$. This state must be an eigenstate
of the Hamiltonian, belonging to the eigenvalue

$$H' = c\,(m^2c^2 + {p'_x}^2 + {p'_y}^2 + {p'_z}^2)^{1/2}, \tag{27}$$

and must therefore be a stationary state. The possible values for $H'$
are all numbers from $mc^2$ to $\infty$, as in the classical theory. The wave
function $\psi(xyz)$ representing this state at any time in Schrödinger’s
representation must satisfy

$$p'_x\,\psi(xyz)\rangle = p_x\,\psi(xyz)\rangle = -i\hbar\,\frac{\partial\psi(xyz)}{\partial x}\rangle,$$

with similar equations for $p_y$ and $p_z$. These equations show that
$\psi(xyz)$ is of the form

$$\psi(xyz) = a\,e^{i(p'_x x + p'_y y + p'_z z)/\hbar}, \tag{28}$$

where a is independent of x, y, and z. From (18) we see now that the
time-dependent wave function $\psi(xyzt)$ is of the form

$$\psi(xyzt) = a_0\,e^{i(p'_x x + p'_y y + p'_z z - H't)/\hbar}, \tag{29}$$

where $a_0$ is independent of x, y, z, and t.

The function (29) of x, y, z, and t describes plane waves in space-time.
We see from this example the suitability of the terms ‘wave
function’ and ‘wave equation’. The frequency of the waves is

$$\nu = H'/h, \tag{30}$$

their wavelength is

$$\lambda = h/({p'_x}^2 + {p'_y}^2 + {p'_z}^2)^{1/2} = h/P', \tag{31}$$

$P'$ being the length of the vector $(p'_x, p'_y, p'_z)$, and their motion is in
the direction specified by the vector $(p'_x, p'_y, p'_z)$ with the velocity

$$\lambda\nu = H'/P' = c^2/v', \tag{32}$$

$v'$ being the velocity of the particle corresponding to the momentum
$(p'_x, p'_y, p'_z)$ as given by formula (26). Equations (30), (31), and (32)
are easily seen to hold in all Lorentz frames of reference, the expression
on the right-hand side of (29) being, in fact, relativistically
invariant with $p'_x$, $p'_y$, $p'_z$ and $H'$ as the components of a 4-vector.
These properties of relativistic invariance led de Broglie, before the
discovery of quantum mechanics, to postulate the existence of waves
of the form (29) associated with the motion of any particle. They
are therefore known as de Broglie waves.

In the limiting case when the mass m is made to tend to zero, the 
classical velocity of the particle v becomes equal to c and hence, from 
(32), the wave velocity also becomes c. The waves are then like the 
light-waves associated with a photon, with the difference that they 
contain no reference to the polarization and involve a complex exponential
instead of sines and cosines. Formulas (30) and (31) are
still valid, connecting the frequency of the light-waves with the
energy of the photon and the wavelength of the light-waves with 
the momentum of the photon. 

For the state represented by (29), the probability of the particle 
being found in any specified small volume when an observation of its 
position is made is independent of where the volume is. This provides 
an example of Heisenberg’s principle of uncertainty, the state being 
one for which the momentum is accurately given and for which, in 
consequence, the position is completely unknown. Such a state is, 
of course, a limiting case which never occurs in practice. The states 
usually met with in practice are those represented by wave packets, 
which may be formed by superposing a number of waves of the type 
(29) belonging to slightly different values of $(p'_x, p'_y, p'_z)$, as discussed
in § 24. The ordinary formula in hydrodynamics for the velocity of
such a wave packet, i.e. the group velocity of the waves, is

$$\frac{d\nu}{d(1/\lambda)}, \tag{33}$$

which gives, from (30) and (31),

$$\frac{dH'}{dP'} = c\,\frac{d}{dP'}\,(m^2c^2 + P'^2)^{1/2} = \frac{c^2P'}{H'} = v'. \tag{34}$$

This is just the velocity of the particle. The wave packet moves in
the same direction and with the same velocity as the particle moves
in classical mechanics.
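The relations between the wave velocity (32), the group velocity (34), and the particle velocity (26) can be checked numerically. The following sketch is an illustration only: the particle is assumed to be an electron, the momentum value is arbitrary, and the derivative dH′/dP′ is approximated by a central difference rather than taken analytically.

```python
import math

C = 299_792_458.0          # velocity of light, m/s
H_PLANCK = 6.62607015e-34  # Planck's constant h, J s
M_E = 9.1093837015e-31     # electron mass in kg (assumed particle)


def energy(P, m=M_E):
    """Eigenvalue (27) for momentum magnitude P."""
    return C * math.sqrt(m * m * C * C + P * P)


def frequency(P, m=M_E):
    """Formula (30): nu = H'/h."""
    return energy(P, m) / H_PLANCK


def wavelength(P):
    """Formula (31): lambda = h/P'."""
    return H_PLANCK / P


def particle_velocity(P, m=M_E):
    """Formula (26): v' = c^2 P'/H'."""
    return C * C * P / energy(P, m)


def phase_velocity(P, m=M_E):
    """Formula (32): lambda * nu."""
    return wavelength(P) * frequency(P, m)


def group_velocity(P, m=M_E, rel_step=1e-3):
    """Formula (33) as dH'/dP', approximated by a central difference."""
    dP = rel_step * P
    return (energy(P + dP, m) - energy(P - dP, m)) / (2.0 * dP)


P = 1.0e-24  # kg m/s, an arbitrary illustrative momentum
print(phase_velocity(P), C * C / particle_velocity(P))
print(group_velocity(P), particle_velocity(P))
```

The wave velocity exceeds c while the group velocity reproduces the particle velocity, the product of the two being c².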


31. The motion of wave packets 


The result just deduced for a free particle is an example of a general 
principle. For any dynamical system with a classical analogue, a state 
for which the classical description is valid as an approximation is 
represented in quantum mechanics by a wave packet, all the co- 
ordinates and momenta having approximate numerical values, whose 
accuracy is limited by Heisenberg’s principle of uncertainty. Now 
Schrödinger’s wave equation fixes how such a wave packet varies with
time, so in order that the classical description may remain valid, the 
wave packet should remain a wave packet and should move according 
to the laws of classical dynamics. We shall verify that this is so. 

We take a dynamical system having a classical analogue and let
its Hamiltonian be $H(q_r, p_r)$ $(r = 1, 2, \ldots, n)$. The corresponding classical
dynamical system will have as Hamiltonian $H_c(q_r, p_r)$ say, obtained
by putting ordinary algebraic variables for the $q_r$ and $p_r$ in $H(q_r, p_r)$
and making $\hbar \to 0$ if it occurs in $H(q_r, p_r)$. The classical Hamiltonian
$H_c$ is, of course, a real function of its variables. It is usually a
quadratic function of the momenta $p_r$, but not always so, the
relativistic theory of a free particle being an example where it is not.
The following argument is valid for $H_c$ any algebraic function of the p’s.

We suppose that the time-dependent wave function in Schrödinger’s
representation is of the form

$$\psi(qt) = A e^{iS/\hbar}, \tag{35}$$

where A and S are real functions of the q’s and t which do not vary
very rapidly with their arguments. The wave function is then of the
form of waves, with A and S determining the amplitude and phase
respectively. Schrödinger’s wave equation (7) gives

$$i\hbar\,\frac{\partial}{\partial t}\,A e^{iS/\hbar}\rangle = H(q_r, p_r)\,A e^{iS/\hbar}\rangle$$

or

$$\left\{i\hbar\,\frac{\partial A}{\partial t} - A\,\frac{\partial S}{\partial t}\right\}\rangle = e^{-iS/\hbar}\,H(q_r, p_r)\,A e^{iS/\hbar}\rangle. \tag{36}$$

Now $e^{-iS/\hbar}$ is evidently a unitary linear operator and may be used for
U in equation (70) of § 26 to give us a unitary transformation. The
q’s remain unchanged by this transformation, each $p_r$ goes over into

$$e^{-iS/\hbar}\,p_r\,e^{iS/\hbar} = p_r + \partial S/\partial q_r,$$

with the help of (31) of § 22, and H goes over into

$$e^{-iS/\hbar}\,H(q_r, p_r)\,e^{iS/\hbar} = H(q_r, p_r + \partial S/\partial q_r),$$

since algebraic relations are preserved by the transformation. Thus
(36) becomes

$$\left\{i\hbar\,\frac{\partial A}{\partial t} - A\,\frac{\partial S}{\partial t}\right\}\rangle = H\!\left(q_r, p_r + \frac{\partial S}{\partial q_r}\right) A\rangle. \tag{37}$$
Let us now suppose that $\hbar$ can be counted as small and let us neglect
terms involving $\hbar$ in (37). This involves neglecting the $p_r$’s that occur
in H in (37), since each $p_r$ is equivalent to the operator $-i\hbar\,\partial/\partial q_r$
operating on the functions of the q’s to the right of it. The surviving
terms give

$$-\frac{\partial S}{\partial t} = H_c\!\left(q_r, \frac{\partial S}{\partial q_r}\right). \tag{38}$$

This is a differential equation which the phase function S has to
satisfy. The equation is determined by the classical Hamiltonian
function $H_c$ and is known as the Hamilton-Jacobi equation in classical
dynamics. It allows S to be real and so shows that the assumption
of the wave form (35) does not lead to an inconsistency.
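Equation (38) can be tested on a simple case. The sketch below is an illustration under assumptions not made in the text: one degree of freedom, a particle of mass M in a uniform field of strength G with classical Hamiltonian p²/2M + Gq, and a trial phase function of the separated form S = −Et + W(q). The derivatives of S are taken by central differences and substituted into (38).

```python
M, G, E = 1.0, 1.0, 2.0  # assumed mass, field strength and energy (arbitrary units)


def classical_hamiltonian(q, p):
    """H_c for the assumed example: a particle in a uniform field."""
    return p * p / (2.0 * M) + G * q


def action(q, t):
    """Trial S(q, t) = -E t + W(q), with dW/dq = sqrt(2M(E - Gq)); needs q < E/G."""
    return -E * t - (2.0 * M * (E - G * q)) ** 1.5 / (3.0 * M * G)


def hamilton_jacobi_residual(q, t, h=1e-5):
    """-dS/dt - H_c(q, dS/dq), which equation (38) requires to vanish."""
    dS_dt = (action(q, t + h) - action(q, t - h)) / (2.0 * h)
    dS_dq = (action(q + h, t) - action(q - h, t)) / (2.0 * h)
    return -dS_dt - classical_hamiltonian(q, dS_dq)


residual = hamilton_jacobi_residual(q=0.5, t=0.3)
print(residual)  # small, limited only by the finite-difference step
```

The trial S is real in the region q < E/G, in line with the remark that (38) allows S to be real.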

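To obtain an equation for A, we must retain the terms in (37)
which are linear in $\hbar$ and see what they give. A direct evaluation of
these terms is rather awkward in the case of a general function H,
and we can get the result we require more easily by first multiplying
both sides of (37) by the bra vector $\langle Af$, where f is an arbitrary real
function of the q’s. This gives

$$\langle Af\left\{i\hbar\,\frac{\partial A}{\partial t} - A\,\frac{\partial S}{\partial t}\right\}\rangle = \langle Af\,H\!\left(q_r, p_r + \frac{\partial S}{\partial q_r}\right) A\rangle.$$

The conjugate complex equation is

$$\langle Af\left\{-i\hbar\,\frac{\partial A}{\partial t} - A\,\frac{\partial S}{\partial t}\right\}\rangle = \langle A\,H\!\left(q_r, p_r + \frac{\partial S}{\partial q_r}\right) fA\rangle.$$

Subtracting and dividing out by $i\hbar$, we obtain

$$2\langle Af\,\frac{\partial A}{\partial t}\rangle = \langle A\left[f, H\!\left(q_r, p_r + \frac{\partial S}{\partial q_r}\right)\right]A\rangle. \tag{39}$$

We now have to evaluate the P.B.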


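$$[f, H(q_r, p_r + \partial S/\partial q_r)].$$

Our assumption that $\hbar$ can be counted as small enables us to expand
$H(q_r, p_r + \partial S/\partial q_r)$ as a power series in the p’s. The terms of zero degree
will contribute nothing to the P.B. The terms of the first degree in
the p’s give a contribution to the P.B. which can be evaluated most
easily with the help of the classical formula (1) of § 21 (this formula
being valid also in the quantum theory if u is independent of the p’s
and v is linear in the p’s). The amount of this contribution is

$$\sum_s \frac{\partial f}{\partial q_s}\left[\frac{\partial H(q_r, p_r)}{\partial p_s}\right]_{p_r = \partial S/\partial q_r},$$

the notation meaning that we must substitute $\partial S/\partial q_r$ for each $p_r$ in
the function [ ] of the q’s and p’s, so as to obtain a function of the q’s
only. The terms of higher degree in the p’s give contributions to the
P.B. which vanish when $\hbar \to 0$. Thus (39) becomes, with neglect of
terms involving $\hbar$, which is equivalent to the neglect of $\hbar^2$ in (37),

$$\langle f\,\frac{\partial A^2}{\partial t}\rangle = \langle A^2 \sum_s \frac{\partial f}{\partial q_s}\left[\frac{\partial H_c(q_r, p_r)}{\partial p_s}\right]_{p_r = \partial S/\partial q_r}\rangle. \tag{40}$$

Now if a(q) and b(q) are any two functions of the q’s, formula
(64) of § 20 gives

$$\langle a(q)\,b(q)\rangle = \int a(q')\,dq'\,b(q'),$$

and so

$$\langle a(q)\,\frac{\partial b(q)}{\partial q_r}\rangle = -\langle \frac{\partial a(q)}{\partial q_r}\,b(q)\rangle, \tag{41}$$

provided a(q) and b(q) satisfy suitable boundary conditions, as discussed
in §§ 22 and 23. Hence (40) may be written

$$\langle f\,\frac{\partial A^2}{\partial t}\rangle = -\langle f \sum_s \frac{\partial}{\partial q_s}\left\{A^2\left[\frac{\partial H_c(q_r, p_r)}{\partial p_s}\right]_{p_r = \partial S/\partial q_r}\right\}\rangle.$$

Since this holds for an arbitrary real function f, we must have

$$\frac{\partial A^2}{\partial t} = -\sum_s \frac{\partial}{\partial q_s}\left\{A^2\left[\frac{\partial H_c(q_r, p_r)}{\partial p_s}\right]_{p_r = \partial S/\partial q_r}\right\}. \tag{42}$$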


This is the equation for the amplitude A of the wave function. To
get an understanding of its significance, let us suppose we have a fluid
moving in the space of the variables q, the density of the fluid at any
point and time being $A^2$ and its velocity being

$$\frac{dq_s}{dt} = \left[\frac{\partial H_c(q_r, p_r)}{\partial p_s}\right]_{p_r = \partial S/\partial q_r}. \tag{43}$$

Equation (42) is then just the equation of conservation for such a
fluid. The motion of the fluid is determined by the function S
satisfying (38), there being one possible motion for each solution
of (38).

For a given S, let us take a solution of (42) for which at some
definite time the density $A^2$ vanishes everywhere outside a certain
small region. We may suppose this region to move with the fluid,
its velocity at each point being given by (43), and then the equation
of conservation (42) will require the density always to vanish outside
the region. There is a limit to how small the region may be, imposed
by the approximation we made in neglecting $\hbar$ in (39). This approximation
is valid only provided

$$\hbar\,\frac{\partial A}{\partial q_r} \ll A\,\frac{\partial S}{\partial q_r}$$

or

$$\frac{1}{A}\frac{\partial A}{\partial q_r} \ll \frac{1}{\hbar}\frac{\partial S}{\partial q_r},$$

which requires that A shall vary by an appreciable fraction of itself
only through a range of the q’s in which S varies by many times $\hbar$,
i.e. a range consisting of many wavelengths of the wave function (35).
Our solution is then a wave packet of the type discussed in § 24 and
remains so for all time.

We thus get a wave function representing a state of motion for 
which the coordinates and momenta have approximate numerical 
values throughout all time. Such a state of motion in quantum 
theory corresponds to the states with which classical theory deals. 
The motion of our wave packet is determined by equations (38) and
(43). From these we get, defining $p_s$ as $\partial S/\partial q_s$,

$$\begin{aligned}
\frac{dp_s}{dt} = \frac{d}{dt}\frac{\partial S}{\partial q_s} &= \frac{\partial^2 S}{\partial t\,\partial q_s} + \sum_u \frac{\partial^2 S}{\partial q_u\,\partial q_s}\frac{dq_u}{dt} \\
&= -\frac{\partial}{\partial q_s} H_c\!\left(q_r, \frac{\partial S}{\partial q_r}\right) + \sum_u \frac{\partial^2 S}{\partial q_u\,\partial q_s}\frac{\partial H_c(q_r, p_r)}{\partial p_u} \\
&= -\frac{\partial H_c(q_r, p_r)}{\partial q_s},
\end{aligned} \tag{44}$$

where in the last line the p’s are counted as independent of the q’s
before the partial differentiation. Equations (43) and (44) are just
the classical equations of motion in Hamiltonian form and show that
the wave packet moves according to the laws of classical mechanics.
We see in this way how the classical equations of motion are derivable
from the quantum theory as a limiting case.

By a more accurate solution of the wave equation one can show 
that the accuracy with which the coordinates and momenta simul- 
taneously have numerical values cannot remain permanently as 
favourable as the limit allowed by Heisenberg’s principle of uncertainty,
equation (56) of § 24, but if it is initially so it will become
less favourable, the wave packet undergoing a spreading.†
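The spreading can be made quantitative in the simplest case. The sketch below rests on assumptions that are not derived in the text: a free non-relativistic particle, an initially Gaussian packet of width σ₀, and the standard result that its position spread at time t is σ₀[1 + (ħt/2mσ₀²)²]^½ while the momentum spread stays at ħ/2σ₀. The minimum product is taken as ħ/2, the modern normalization, which may differ by a numerical factor from the form used in equation (56) of § 24.

```python
import math

HBAR = 1.054571817e-34   # J s
M_E = 9.1093837015e-31   # electron mass in kg (assumed particle)


def packet_width(t, sigma0, m=M_E):
    """Position spread of a free Gaussian packet at time t (assumed standard result)."""
    return sigma0 * math.sqrt(1.0 + (HBAR * t / (2.0 * m * sigma0 ** 2)) ** 2)


def uncertainty_product(t, sigma0, m=M_E):
    """Product of position and momentum spreads; the momentum spread is constant."""
    return packet_width(t, sigma0, m) * HBAR / (2.0 * sigma0)


sigma0 = 1.0e-10  # metres, assumed initial width
for t in (0.0, 1.0e-16, 1.0e-15):
    ratio = uncertainty_product(t, sigma0) / (HBAR / 2.0)
    print(f"t = {t:.1e} s  width = {packet_width(t, sigma0):.3e} m  "
          f"product/(hbar/2) = {ratio:.3f}")
```

The product starts at its minimum and can only increase with time, which is the sense in which the accuracy becomes less favourable.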


32. The action principle‡

Equation (10) shows that the Heisenberg dynamical variables at
time t, $v_t$, are connected with their values at time $t_0$, $v_{t_0}$ or v, by a
unitary transformation. The Heisenberg variables at time $t+\delta t$ are
connected with their values at time t by an infinitesimal unitary
transformation, as is shown by the equation of motion (11) or (13),
which gives the connexion between $v_{t+\delta t}$ and $v_t$ of the form of (79) or
(80) of § 26 with $H_t$ for F and $\delta t/\hbar$ for $\epsilon$. The variation with time of
the Heisenberg dynamical variables may thus be looked upon as the
continuous unfolding of a unitary transformation. In classical
mechanics the dynamical variables at time $t+\delta t$ are connected with
their values at time t by an infinitesimal contact transformation and
the whole motion may be looked upon as the continuous unfolding of a 
contact transformation. We have here the mathematical foundation 
of the analogy between the classical and quantum equations of 
motion, and can develop it to bring out the quantum analogue of all 
the main features of the classical theory of dynamics. 

Suppose we have a representation in which the complete set of
commuting observables $\xi$ are diagonal, so that a basic bra is $\langle\xi'|$.
We can introduce a second representation in which the basic bras are

$$\langle\xi'^*| = \langle\xi'|T. \tag{45}$$

The new basic bras depend on the time t and give us a moving
representation, like a moving system of axes in an ordinary vector
space. Comparing (45) with the conjugate imaginary of (8), we see
that the new basic vectors are just the transforms in the Heisenberg
picture of the original basic vectors in the Schrödinger picture, and
hence they must be connected with the Heisenberg dynamical
variables $v_t$ in the same way in which the original basic vectors are
connected with the Schrödinger dynamical variables v. In particular,
each $\langle\xi'^*|$ must be an eigenvector of the $\xi_t$’s belonging to the eigenvalues
$\xi'$. It may therefore be written $\langle\xi'_t|$, with the understanding
that the numbers $\xi'_t$ are the same eigenvalues of the $\xi_t$’s that the $\xi'$’s
are of the $\xi$’s. From (45) we get

$$\langle\xi'_t|\xi''\rangle = \langle\xi'|T|\xi''\rangle, \tag{46}$$

showing that the transformation function is just the representative
of T in the original representation.

† See Kennard, Z. f. Physik, 44 (1927), 344; Darwin, Proc. Roy. Soc. A, 117 (1927), 268.

‡ This section may be omitted by the student who is not specially concerned with
higher dynamics.

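Differentiating (45) with respect to t and using (6), we get

$$i\hbar\,\frac{d}{dt}\langle\xi'_t| = i\hbar\,\langle\xi'|\frac{dT}{dt} = \langle\xi'|HT = \langle\xi'_t|H_t,$$

with the help of (12). Multiplying on the right by any ket $|a\rangle$
independent of t, we get

$$i\hbar\,\frac{d}{dt}\langle\xi'_t|a\rangle = \langle\xi'_t|H_t|a\rangle = \int \langle\xi'_t|H_t|\xi''_t\rangle\,d\xi''_t\,\langle\xi''_t|a\rangle, \tag{47}$$

if we take for definiteness the case of continuous eigenvalues for the
$\xi$’s. Now equation (5), written in terms of representatives, reads

$$i\hbar\,\frac{d}{dt}\langle\xi'|Pt\rangle = \int \langle\xi'|H|\xi''\rangle\,d\xi''\,\langle\xi''|Pt\rangle. \tag{48}$$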


Since $\langle\xi'_t|H_t|\xi''_t\rangle$ is the same function of the variables $\xi'_t$ and $\xi''_t$ that
$\langle\xi'|H|\xi''\rangle$ is of $\xi'$ and $\xi''$, equations (47) and (48) are of precisely the
same form, with the variables $\xi'_t$, $\xi''_t$ in (47) playing the role of the
variables $\xi'$ and $\xi''$ in (48) and the function $\langle\xi'_t|a\rangle$ playing the role
of the function $\langle\xi'|Pt\rangle$. We can thus look upon (47) as a form of
Schrödinger’s wave equation, with the function $\langle\xi'_t|a\rangle$ of the variables
$\xi'_t$ as the wave function. In this way Schrödinger’s wave equation
appears in a new light, as the condition on the representative, in the
moving representation with the Heisenberg variables $\xi_t$ diagonal, of the
fixed ket corresponding to a state in the Heisenberg picture. The function
$\langle\xi'_t|a\rangle$ owes its variation with time to its left factor $\langle\xi'_t|$, in contradistinction
to the function $\langle\xi'|Pt\rangle$, which owes its variation with time
to its right factor $|Pt\rangle$.
If we put $|a\rangle = |\xi''\rangle$ in (47), we get

$$i\hbar\,\frac{d}{dt}\langle\xi'_t|\xi''\rangle = \int \langle\xi'_t|H_t|\xi'''_t\rangle\,d\xi'''_t\,\langle\xi'''_t|\xi''\rangle, \tag{49}$$

showing that the transformation function $\langle\xi'_t|\xi''\rangle$ satisfies Schrödinger’s
wave equation. Now $\xi_{t_0} = \xi$, so we must have

$$\langle\xi'_{t_0}|\xi''\rangle = \delta(\xi'_{t_0} - \xi''), \tag{50}$$

the $\delta$ function here being understood as the product of a number of
factors, one for each $\xi$-variable, such as occurs for the variables
$\xi_{v+1}, \ldots, \xi_u$ on the right-hand side of equation (34) of § 16. Thus the
transformation function $\langle\xi'_t|\xi''\rangle$ is that solution of Schrödinger’s wave
equation for which the $\xi$’s certainly have the values $\xi''$ at time $t_0$.
The square of its modulus, $|\langle\xi'_t|\xi''\rangle|^2$, is the relative probability of the
$\xi$’s having the values $\xi'$ at time $t > t_0$ if they certainly have the values
$\xi''$ at time $t_0$. We may write $\langle\xi'_t|\xi''\rangle$ as $\langle\xi'_t|\xi''_{t_0}\rangle$ and consider it as
depending on $t_0$ as well as on t. To get its dependence on $t_0$, we take
the conjugate complex of equation (49), interchange t and $t_0$, and also
interchange single primes and double primes. This gives

$$-i\hbar\,\frac{d}{dt_0}\langle\xi'_t|\xi''_{t_0}\rangle = \int \langle\xi'_t|\xi'''_{t_0}\rangle\,d\xi'''_{t_0}\,\langle\xi'''_{t_0}|H_{t_0}|\xi''_{t_0}\rangle. \tag{51}$$


The foregoing discussion of the transformation function $\langle\xi'_t|\xi''\rangle$ is
valid with the $\xi$'s any complete set of commuting observables. The
equations were written down for the case of the $\xi$'s having continuous
eigenvalues, but they would still be valid if any of the $\xi$'s have
discrete eigenvalues, provided the necessary formal changes are made
in them. Let us now take a dynamical system having a classical
analogue and let us take the $\xi$'s to be the coordinates $q$. Put

$$\langle q'_t|q''\rangle = e^{iS/\hbar} \tag{52}$$

and so define the function $S$ of the variables $q'_t$, $q''$. This function also
depends explicitly on $t$. (52) is a solution of Schrödinger's wave
equation and, if $\hbar$ can be counted as small, it can be handled in the
same way as (35) was. The $S$ of (52) differs from the $S$ of (35) on
account of there being no $A$ in (52), which makes the $S$ of (52) complex,
but the real part of this $S$ equals the $S$ of (35) and its pure
imaginary part is of the order $\hbar$. Thus, in the limit $\hbar \to 0$, the $S$ of
(52) will equal that of (35) and will therefore satisfy, corresponding
to (38),

$$-\partial S/\partial t = H_c(q'_{rt}, p'_{rt}), \tag{53}$$

where

$$p'_{rt} = \partial S/\partial q'_{rt}, \tag{54}$$

and $H_c$ is the Hamiltonian of the classical analogue of our quantum
dynamical system. But (52) is also a solution of (51) with $q$'s for $\xi$'s,
which is the conjugate complex of Schrödinger's wave equation in the
variables $q''$ or $q''_{t_0}$. This causes $S$ to satisfy also†

$$\partial S/\partial t_0 = H_c(q''_r, p''_r), \tag{55}$$

where

$$p''_r = -\partial S/\partial q''_r. \tag{56}$$
The solution of the Hamilton-Jacobi equations (53), (55) is the
action function of classical mechanics for the time interval $t_0$ to $t$,
i.e. it is the time integral of the Lagrangian $L$,

$$S = \int_{t_0}^{t} L(t')\,dt'. \tag{57}$$

Thus the $S$ defined by (52) is the quantum analogue of the classical action
function and equals it in the limit $\hbar \to 0$. To get the quantum analogue
of the classical Lagrangian, we pass to the case of an infinitesimal
time interval by putting $t = t_0+\delta t$ and we then have $\langle q'_{t_0+\delta t}|q''_{t_0}\rangle$ as the
analogue of $e^{iL(t_0)\delta t/\hbar}$. For the sake of the analogy, one should consider
$L(t_0)$ as a function of the coordinates $q'$ at time $t_0+\delta t$ and the
coordinates $q''$ at time $t_0$, rather than as a function of the coordinates
and velocities at time $t_0$, as one usually does.
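As a hedged illustration of (52) to (57), not taken from the text, consider a free particle of mass $m$ in one dimension. The closed form of the transformation function quoted in the first line is the standard free-particle result and is assumed here rather than derived; $\tau = t - t_0$ is a symbol introduced only for brevity.

```latex
\begin{aligned}
\langle q'_t|q''\rangle &= \Bigl(\frac{m}{2\pi i\hbar\tau}\Bigr)^{1/2}
   \exp\Bigl\{\frac{im(q'_t-q'')^2}{2\hbar\tau}\Bigr\}, \\
S &= \frac{m(q'_t-q'')^2}{2\tau} + \frac{i\hbar}{2}\log\frac{2\pi i\hbar\tau}{m}, \\
\lim_{\hbar\to0} S &= \frac{m(q'_t-q'')^2}{2\tau}, \qquad
   p'_t = \frac{\partial S}{\partial q'_t} = \frac{m(q'_t-q'')}{\tau}, \qquad
   p'' = -\frac{\partial S}{\partial q''} = \frac{m(q'_t-q'')}{\tau}, \\
-\frac{\partial S}{\partial t} &= \frac{m(q'_t-q'')^2}{2\tau^2} = \frac{p'^2_t}{2m}, \qquad
   \frac{\partial S}{\partial t_0} = \frac{m(q'_t-q'')^2}{2\tau^2} = \frac{p''^2}{2m}.
\end{aligned}
```

The last two lines use the limiting $S$. The correction term in the second line vanishes as $\hbar \to 0$, and what remains is the time integral of $L = \tfrac12 m\dot q^2$ along the uniform-velocity trajectory joining $q''$ to $q'_t$, in agreement with (57).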

The principle of least action in classical mechanics says that the
action function (57) remains stationary for small variations of the
trajectory of the system which do not alter the end points, i.e. for small
variations of the $q$'s at all intermediate times between $t_0$ and $t$ with $q_{t_0}$
and $q_t$ fixed. Let us see what it corresponds to in the quantum theory.

Put

$$\exp\Bigl\{i\int_{t_a}^{t_b} L(t)\,dt/\hbar\Bigr\} = \exp\{iS(t_b,t_a)/\hbar\} = B(t_b,t_a), \tag{58}$$

so that $B(t_b,t_a)$ corresponds to $\langle q'_{t_b}|q'_{t_a}\rangle$ in the quantum theory. (We
here allow $q'_{t_a}$ and $q'_{t_b}$ to denote different eigenvalues of $q_{t_a}$ and $q_{t_b}$, to
save having to introduce a large number of primes into the analysis.)
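Now suppose the time interval $t_0 \to t$ to be divided up into a large
number of small time intervals $t_0 \to t_1$, $t_1 \to t_2$, ..., $t_{m-1} \to t_m$, $t_m \to t$, by
the introduction of a sequence of intermediate times $t_1, t_2, \ldots, t_m$. Then

$$B(t,t_0) = B(t,t_m)\,B(t_m,t_{m-1})\cdots B(t_2,t_1)\,B(t_1,t_0). \tag{59}$$
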
The corresponding quantum equation, which follows from the property
of basic vectors (35) of § 16, is

$$\langle q'_t|q'_0\rangle = \iint\cdots\int\langle q'_t|q'_m\rangle\,dq'_m\,\langle q'_m|q'_{m-1}\rangle\,dq'_{m-1}\cdots\langle q'_2|q'_1\rangle\,dq'_1\,\langle q'_1|q'_0\rangle, \tag{60}$$

$q'_k$ being written for $q'_{t_k}$ for brevity. At first sight there does not seem
to be any close correspondence between (59) and (60). We must,
however, analyse the meaning of (59) rather more carefully. We must
regard each factor $B$ as a function of the $q$'s at the two ends of the
time interval to which it refers. This makes the right-hand side of
(59) a function, not only of $q_t$ and $q_{t_0}$, but also of all the intermediate
$q$'s. Equation (59) is valid only when we substitute for the intermediate
$q$'s in its right-hand side their values for the real trajectory,
small variations in which values leave $S$ stationary and therefore also,
from (58), leave $B(t,t_0)$ stationary. It is the process of substituting
these values for the intermediate $q$'s which corresponds to the
integrations over all values for the intermediate $q'$'s in (60). The quantum
analogue of the action principle is thus absorbed in the composition
law (60) and the classical requirement that the values of the intermediate
$q$'s shall make $S$ stationary corresponds to the condition
in quantum mechanics that all values of the intermediate $q'$'s
are important in proportion to their contribution to the integral
in (60).

† For a more accurate comparison of transformation functions with classical
theory, see Van Vleck, Proc. Nat. Acad. 14, 178.

Let us see how (59) can be a limiting case of (60) for $\hbar$ small. We
must suppose the integrand in (60) to be of the form $e^{iF/\hbar}$, where $F$ is
a function of $q'_0, q'_1, q'_2, \ldots, q'_m, q'_t$ which remains continuous as $\hbar$ tends
to zero, so that the integrand is a rapidly oscillating function when
$\hbar$ is small. The integral of such a rapidly oscillating function will be
extremely small, except for the contribution arising from a region in
the domain of integration where comparatively large variations in
the $q'_k$ produce only very small variations in $F$. Such a region must
be the neighbourhood of a point where $F$ is stationary for small
variations of the $q'_k$. Thus the integral in (60) is determined essentially by
the value of the integrand at a point where the integrand is stationary
for small variations of the intermediate $q'$'s, and so (60) goes over
into (59).
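The stationary-phase argument can be checked numerically in a one-variable toy case. The sketch below is an illustration under stated assumptions, not part of the text: the phase $F(q) = (q-1)^2$, the value $\hbar = 0.01$, and the helper name `oscillatory_integral` are all hypothetical choices. It compares the integral of $e^{iF/\hbar}$ over a window containing the stationary point with that over a window of equal width that does not contain it.

```python
import cmath
import math

def oscillatory_integral(F, a, b, hbar, steps=20000):
    """Trapezoidal estimate of the integral of exp(i F(q) / hbar) over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (cmath.exp(1j * F(a) / hbar) + cmath.exp(1j * F(b) / hbar))
    for k in range(1, steps):
        total += cmath.exp(1j * F(a + k * h) / hbar)
    return total * h

def F(q):
    # Illustrative phase with a single stationary point at q = 1.
    return (q - 1.0) ** 2

hbar = 0.01  # small compared with the scale on which F varies
near = oscillatory_integral(F, 0.0, 2.0, hbar)  # window containing q = 1
far = oscillatory_integral(F, 2.0, 4.0, hbar)   # equal width, no stationary point

# The last number is the magnitude of the full Fresnel integral, sqrt(pi * hbar).
print(abs(near), abs(far), math.sqrt(math.pi * hbar))
```

The window around the stationary point carries almost the whole magnitude $\sqrt{\pi\hbar}$ of the integral over all $q$, while the other window contributes only terms of order $\hbar$ from its end points.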

Equations (54) and (56) express that the variables $q'_t$, $p'_t$ are
connected with the variables $q''$, $p''$ by a contact transformation and are
one of the standard forms of writing the equations of a contact
transformation. There is an analogous form for writing the equations of a
unitary transformation in quantum mechanics. We get from (52), with
the help of (45) of § 22,

$$\langle q'_t|p_{rt}|q''\rangle = -i\hbar\frac{\partial}{\partial q'_{rt}}\langle q'_t|q''\rangle = \frac{\partial S(q'_t,q'')}{\partial q'_{rt}}\langle q'_t|q''\rangle. \tag{61}$$

Similarly, with the help of (46) of § 22,

$$\langle q'_t|p_r|q''\rangle = i\hbar\frac{\partial}{\partial q''_r}\langle q'_t|q''\rangle = -\frac{\partial S(q'_t,q'')}{\partial q''_r}\langle q'_t|q''\rangle. \tag{62}$$

From the general definition of functions of commuting observables,
we have

$$\langle q'_t|f(q_t)g(q)|q''\rangle = f(q'_t)g(q'')\langle q'_t|q''\rangle, \tag{63}$$

where $f(q_t)$ and $g(q)$ are functions of the $q_t$'s and $q$'s respectively. Let
$G(q_t,q)$ be any function of the $q_t$'s and $q$'s consisting of a sum or
integral of terms each of the form $f(q_t)g(q)$, so that all the $q_t$'s in $G$
occur to the left of all the $q$'s. Such a function we call well ordered.
Applying (63) to each of the terms in $G$ and adding or integrating,
we get

$$\langle q'_t|G(q_t,q)|q''\rangle = G(q'_t,q'')\langle q'_t|q''\rangle.$$

Now let us suppose each $p_{rt}$ and $p_r$ can be expressed as a well-ordered
function of the $q_t$'s and $q$'s and write these functions $p_{rt}(q_t,q)$, $p_r(q_t,q)$.
Putting these functions for $G$, we get

$$\langle q'_t|p_{rt}|q''\rangle = p_{rt}(q'_t,q'')\langle q'_t|q''\rangle,$$

$$\langle q'_t|p_r|q''\rangle = p_r(q'_t,q'')\langle q'_t|q''\rangle.$$

Comparing these equations with (61) and (62) respectively, we see
that

$$p_{rt}(q'_t,q'') = \frac{\partial S(q'_t,q'')}{\partial q'_{rt}}, \qquad p_r(q'_t,q'') = -\frac{\partial S(q'_t,q'')}{\partial q''_r}.$$

This means that

$$p_{rt} = \frac{\partial S(q_t,q)}{\partial q_{rt}}, \qquad p_r = -\frac{\partial S(q_t,q)}{\partial q_r}, \tag{64}$$

provided the right-hand sides of (64) are written as well-ordered
functions.

These equations are of the same form as (54) and (56), but refer to
the non-commuting quantum variables $q_t$, $q$ instead of the ordinary
algebraic variables $q'_t$, $q''$. They show how the conditions for a unitary
transformation between quantum variables are analogous to the
conditions for a contact transformation between classical variables. The
analogy is not complete, however, because the classical $S$ must be real
and there is no simple condition corresponding to this for the $S$ of (64).


33. The Gibbs ensemble 

In our work up to the present we have been assuming all along that
our dynamical system at each instant of time is in a definite state,
that is to say, its motion is specified as completely and accurately as
is possible without conflicting with the general principles of the theory.
In the classical theory this would mean, of course, that all the
coordinates and momenta have specified values. Now we may be interested
in a motion which is specified to a lesser extent than this maximum
possible. The present section will be devoted to the methods to be
used in such a case.

The procedure in classical mechanics is to introduce what is called 
a Gibbs ensemble, the idea of which is as follows. We consider all the 
dynamical coordinates and momenta as Cartesian coordinates in a 
certain space, the phase space, whose number of dimensions is twice 
the number of degrees of freedom of the system. Any state of the 
system can then be represented by a point in this space. This point 
will move according to the classical equations of motion (14). Sup- 
pose, now, that we are not given that the system is in a definite state 
at any time, but only that it is in one or other of a number of possible 
states according to a definite probability law. We should then be 
able to represent it by a fluid in the phase space, the mass of fluid in 
any volume of the phase space being the total probability of the 
system being in any state whose representative point lies in that 
volume. Each particle of the fluid will be moving according to the 
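equations of motion (14). If we introduce the density $\rho$ of the fluid
at any point, equal to the probability per unit volume of phase space
of the system being in the neighbourhood of the corresponding state,
we shall have the equation of conservation

$$\begin{aligned}
\frac{\partial\rho}{\partial t} &= -\sum_r\Bigl\{\frac{\partial}{\partial q_r}\Bigl(\rho\frac{dq_r}{dt}\Bigr)+\frac{\partial}{\partial p_r}\Bigl(\rho\frac{dp_r}{dt}\Bigr)\Bigr\} \\
&= -\sum_r\Bigl\{\frac{\partial}{\partial q_r}\Bigl(\rho\frac{\partial H}{\partial p_r}\Bigr)-\frac{\partial}{\partial p_r}\Bigl(\rho\frac{\partial H}{\partial q_r}\Bigr)\Bigr\} \\
&= -[\rho,H].
\end{aligned} \tag{65}$$

This may be considered as the equation of motion for the fluid, since
it determines the density $\rho$ for all time if $\rho$ is given initially as a
function of the $q$'s and $p$'s. It is, apart from the minus sign, of the
same form as the ordinary equation of motion (15) for a dynamical
variable.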
The requirement that the total probability of the system being in
any state shall be unity gives us a normalizing condition for $\rho$

$$\iint\rho\,dq\,dp = 1, \tag{66}$$

the integration being over the whole of phase space and the single
differential $dq$ or $dp$ being written to denote the product of all the
$dq$'s or $dp$'s. If $\beta$ denotes any function of the dynamical variables,
the average value of $\beta$ will be

$$\iint\beta\rho\,dq\,dp. \tag{67}$$

It makes only a trivial alteration in the theory, but often facilitates
discussion, if we work with a density $\rho$ differing from the above one
by a positive constant factor, $k$ say, so that we have instead of (66)

$$\iint\rho\,dq\,dp = k.$$


With this density we can picture the fluid as representing a number 
k of similar dynamical systems, all following through their motions 
independently in the same place, without any mutual disturbance or 
interaction. The density at any point would then be the probable or 
average number of systems in the neighbourhood of any state per unit 
volume of phase space, and expression (67) would give the average
total value of $\beta$ for all the systems. Such a set of dynamical systems,
which is the ensemble introduced by Gibbs, is usually not realizable 
in practice, except as a rough approximation, but it forms all the 
same a useful theoretical abstraction. 

We shall now see that there exists a corresponding density $\rho$
in quantum mechanics, having properties analogous to the above. 
It was first introduced by von Neumann. Its existence is rather 
surprising in view of the fact that phase space has no meaning in 
quantum mechanics, there being no possibility of assigning numerical 
values simultaneously to the q’s and p’s. 

We consider a dynamical system which is at a certain time in one 
or other of a number of possible states according to some given 
probability law. These states may be either a discrete set or a con- 
tinuous range, or both together. We shall here take for definiteness 
the case of a discrete set and suppose them labelled by a parameter m. 
Let the normalized ket vectors corresponding to them be $|m\rangle$ and let
the probability of the system being in the $m$th state be $P_m$. We then
define the quantum density $\rho$ by

$$\rho = \sum_m |m\rangle P_m\langle m|. \tag{68}$$

Let $\rho'$ be any eigenvalue of $\rho$ and $|\rho'\rangle$ an eigenket belonging to this
eigenvalue. Then

$$\sum_m |m\rangle P_m\langle m|\rho'\rangle = \rho|\rho'\rangle = \rho'|\rho'\rangle$$

so that

$$\sum_m \langle\rho'|m\rangle P_m\langle m|\rho'\rangle = \rho'\langle\rho'|\rho'\rangle$$

or

$$\sum_m P_m|\langle m|\rho'\rangle|^2 = \rho'\langle\rho'|\rho'\rangle.$$

Now $P_m$, being a probability, can never be negative. It follows that
$\rho'$ cannot be negative. Thus $\rho$ has no negative eigenvalues, in analogy
with the fact that the classical density $\rho$ is never negative.
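A small numerical sketch may make the definition (68) concrete. Everything specific in it is an assumption chosen for illustration: a two-dimensional ket space, two normalized but non-orthogonal kets, the probabilities 0.25 and 0.75, and the helper names. It builds $\rho$ and confirms that its eigenvalues are non-negative and sum to unity.

```python
import math

def density_matrix(kets, probs):
    """Return rho = sum_m |m> P_m <m| as a nested list (kets assumed normalized)."""
    n = len(kets[0])
    rho = [[0j] * n for _ in range(n)]
    for ket, p in zip(kets, probs):
        for i in range(n):
            for j in range(n):
                rho[i][j] += p * ket[i] * ket[j].conjugate()
    return rho

def eigenvalues_2x2_hermitian(a):
    """Closed-form eigenvalues of a 2x2 Hermitian matrix, smallest first."""
    tr = (a[0][0] + a[1][1]).real
    det = (a[0][0] * a[1][1] - a[0][1] * a[1][0]).real
    disc = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    return ((tr - disc) / 2.0, (tr + disc) / 2.0)

s = 1.0 / math.sqrt(2.0)
kets = [[1 + 0j, 0j], [s + 0j, s + 0j]]  # normalized, not orthogonal
probs = [0.25, 0.75]                     # hypothetical probabilities P_m

rho = density_matrix(kets, probs)
low, high = eigenvalues_2x2_hermitian(rho)
print(low, high)
```

Note that the states $|m\rangle$ need not be orthogonal for the argument in the text to hold; only $P_m \geq 0$ is used.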

Let us now obtain the equation of motion for our quantum $\rho$. In
Schrödinger's picture the kets and bras in (68) will vary with the time
in accordance with Schrödinger's equation (5) and the conjugate
imaginary of this equation, while the $P_m$'s will remain constant, since
the system, so long as it is left undisturbed, cannot change over from
a state corresponding to one ket satisfying Schrödinger's equation to
a state corresponding to another. We thus have

$$\begin{aligned}
i\hbar\frac{d\rho}{dt} &= \sum_m i\hbar\Bigl\{\frac{d|m\rangle}{dt}P_m\langle m| + |m\rangle P_m\frac{d\langle m|}{dt}\Bigr\} \\
&= \sum_m\{H|m\rangle P_m\langle m| - |m\rangle P_m\langle m|H\} \\
&= H\rho-\rho H.
\end{aligned} \tag{69}$$

This is the quantum analogue of the classical equation of motion
(65). Our quantum $\rho$, like the classical one, is determined for all time
if it is given initially.

From the assumption of § 12, the average value of any observable
$\beta$ when the system is in the state $m$ is $\langle m|\beta|m\rangle$. Hence if the system
is distributed over the various states $m$ according to the probability
law $P_m$, the average value of $\beta$ will be $\sum_m P_m\langle m|\beta|m\rangle$. If we introduce
a representation with a discrete set of basic ket vectors $|\xi'\rangle$ say, this
equals

$$\sum_{m\xi'} P_m\langle m|\xi'\rangle\langle\xi'|\beta|m\rangle = \sum_{\xi'm}\langle\xi'|\beta|m\rangle P_m\langle m|\xi'\rangle = \sum_{\xi'}\langle\xi'|\beta\rho|\xi'\rangle = \sum_{\xi'}\langle\xi'|\rho\beta|\xi'\rangle, \tag{70}$$

the last step being easily verified with the law of matrix multiplication,
equation (44) of § 17. The expressions (70) are the analogue of
the expression (67) of the classical theory. Whereas in the classical
theory we have to multiply $\beta$ by $\rho$ and take the integral of the
product over all phase space, in the quantum theory we have to
multiply $\beta$ by $\rho$, with the factors in either order, and take the
diagonal sum of the product in a representation. If the representation
involves a continuous range of basic vectors $|\xi'\rangle$, we get instead

$$\int\langle\xi'|\beta\rho|\xi'\rangle\,d\xi' = \int\langle\xi'|\rho\beta|\xi'\rangle\,d\xi', \tag{71}$$

so that we must carry through a process of ‘integrating along the
diagonal’ instead of summing the diagonal elements. We shall define
(71) to be the diagonal sum of $\beta\rho$ in the continuous case. It can easily
be verified, from the properties of transformation functions (56) of
§ 18, that the diagonal sum is the same for all representations.
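The equality of the three expressions in (70) can be checked on a small example. The sketch below is illustrative only: the two kets, the probabilities, and the matrix chosen for $\beta$ are hypothetical, and the helper names are not from the text. It computes the average directly from $\sum_m P_m\langle m|\beta|m\rangle$ and compares it with the diagonal sums of $\beta\rho$ and $\rho\beta$.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    """Diagonal sum of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))

def expectation(ket, op):
    """<m|op|m> for a normalized ket given as a list of components."""
    n = len(ket)
    return sum(ket[i].conjugate() * op[i][j] * ket[j]
               for i in range(n) for j in range(n))

s = 1.0 / math.sqrt(2.0)
kets = [[1 + 0j, 0j], [s + 0j, s + 0j]]       # hypothetical states |m>
probs = [0.25, 0.75]                          # hypothetical probabilities P_m
beta = [[1 + 0j, 2 + 0j], [2 + 0j, -1 + 0j]]  # hypothetical observable

# rho = sum_m |m> P_m <m|, equation (68)
rho = [[sum(p * k[i] * k[j].conjugate() for k, p in zip(kets, probs))
        for j in range(2)] for i in range(2)]

direct = sum(p * expectation(k, beta) for k, p in zip(kets, probs))
via_beta_rho = trace(matmul(beta, rho))
via_rho_beta = trace(matmul(rho, beta))
print(direct, via_beta_rho, via_rho_beta)
```

All three numbers agree, as (70) requires, even though $\beta$ and $\rho$ do not commute in this example.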

From the condition that the $|m\rangle$'s are normalized we get, with
discrete $\xi'$'s

$$\sum_{\xi'}\langle\xi'|\rho|\xi'\rangle = \sum_{\xi'm}\langle\xi'|m\rangle P_m\langle m|\xi'\rangle = \sum_m P_m = 1, \tag{72}$$

since the total probability of the system being in any state is unity.
This is the analogue of equation (66). The probability of the system
being in the state $\xi'$, or the probability of the observables $\xi$ which
are diagonal in the representation having the values $\xi'$, is, according
to the rule for interpreting representatives of kets (51) of § 18,

$$\sum_m |\langle\xi'|m\rangle|^2 P_m = \langle\xi'|\rho|\xi'\rangle, \tag{73}$$

which gives us a meaning for each term in the sum on the left-hand
side of (72). For continuous $\xi'$'s, the right-hand side of (73) gives the
probability of the $\xi$'s having values in the neighbourhood of $\xi'$ per
unit range of variation of the values $\xi'$.

As in the classical theory, we may take a density equal to $k$ times
the above $\rho$ and consider it as representing a Gibbs ensemble of $k$
similar dynamical systems, between which there is no mutual
disturbance or interaction. We shall then have $k$ on the right-hand side
of (72), and (70) or (71) will give the total average $\beta$ for all the
members of the ensemble, while (73) will give the total probability
of a member of the ensemble having values for its $\xi$'s equal to $\xi'$
or in the neighbourhood of $\xi'$ per unit range of variation of the
values $\xi'$.

An important application of the Gibbs ensemble is to a dynamical
system in thermodynamic equilibrium with its surroundings at a
given temperature $T$. Gibbs showed that such a system is
represented in classical mechanics by the density

$$\rho = ce^{-H/kT}, \tag{74}$$

$H$ being the Hamiltonian, which is now independent of the time, $k$
being Boltzmann's constant, and $c$ being a number chosen to make
the normalizing condition (66) hold. This formula may be taken over
unchanged into the quantum theory. At high temperatures, (74)
becomes $\rho = c$, which gives, on being substituted into the right-hand
side of (73), $c\langle\xi'|\xi'\rangle = c$ in the case of discrete $\xi'$'s. This shows that
at high temperatures all discrete states are equally probable.
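The high-temperature statement can be seen numerically for a system with a finite number of discrete energy levels. The sketch below assumes a representation in which $H$ is diagonal, so that (73) with (74) reduces to $c\,e^{-E/kT}$ for each level; the four energy values and the function name are hypothetical.

```python
import math

def thermal_probabilities(energies, kT):
    """Diagonal elements of rho = c * exp(-H / kT) in the energy representation."""
    weights = [math.exp(-e / kT) for e in energies]
    c = 1.0 / sum(weights)  # fixes the normalization, cf. (72)
    return [c * w for w in weights]

energies = [0.0, 1.0, 2.0, 3.0]  # hypothetical discrete levels, arbitrary units
cold = thermal_probabilities(energies, kT=0.5)
hot = thermal_probabilities(energies, kT=1000.0)
print(cold)
print(hot)
```

At the low temperature the lowest level dominates; at the high temperature the four probabilities are nearly equal, each close to one quarter.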


VI
ELEMENTARY APPLICATIONS 

34. The harmonic oscillator 

A SIMPLE and interesting example of a dynamical system in quantum
mechanics is the harmonic oscillator. This example is of importance
for general theory, because it forms a corner-stone in the theory of
radiation. The dynamical variables needed for describing the system
are just one coordinate $q$ and its conjugate momentum $p$. The
Hamiltonian in classical mechanics is

$$H = \frac{1}{2m}(p^2+m^2\omega^2q^2), \tag{1}$$

where $m$ is the mass of the oscillating particle and $\omega$ is $2\pi$ times the
frequency. We assume the same Hamiltonian in quantum mechanics.
This Hamiltonian, together with the quantum condition (10) of § 22,
define the system completely.

The Heisenberg equations of motion are

$$\begin{aligned}
\dot q_t &= [q_t,H] = p_t/m, \\
\dot p_t &= [p_t,H] = -m\omega^2q_t.
\end{aligned} \tag{2}$$

It is convenient to introduce the dimensionless complex dynamical
variable

$$\eta = (2m\hbar\omega)^{-1/2}(p+im\omega q). \tag{3}$$

The equations of motion (2) give

$$\dot\eta_t = (2m\hbar\omega)^{-1/2}(-m\omega^2q_t+i\omega p_t) = i\omega\eta_t.$$

This equation can be integrated to give

$$\eta_t = \eta_0e^{i\omega t}, \tag{4}$$

where $\eta_0$ is a linear operator independent of $t$, and is equal to the
value of $\eta_t$ at time $t = 0$. The above equations are all as in the
classical theory.

We can express $q$ and $p$ in terms of $\eta$ and its conjugate complex $\bar\eta$
and may thus work entirely in terms of $\eta$ and $\bar\eta$. We have

$$\begin{aligned}
\hbar\omega\,\eta\bar\eta &= (2m)^{-1}(p+im\omega q)(p-im\omega q) \\
&= (2m)^{-1}[p^2+m^2\omega^2q^2+im\omega(qp-pq)] \\
&= H-\tfrac12\hbar\omega
\end{aligned} \tag{5}$$

and similarly

$$\hbar\omega\,\bar\eta\eta = H+\tfrac12\hbar\omega. \tag{6}$$

Thus

$$\bar\eta\eta-\eta\bar\eta = 1. \tag{7}$$

Equation (5) or (6) gives $H$ in terms of $\eta$ and $\bar\eta$ and (7) gives the
commutation relation connecting $\eta$ and $\bar\eta$. From (5)

$$\hbar\omega\,\bar\eta\eta\bar\eta = \bar\eta H-\tfrac12\hbar\omega\bar\eta$$

and from (6)

$$\hbar\omega\,\bar\eta\eta\bar\eta = H\bar\eta+\tfrac12\hbar\omega\bar\eta.$$

Thus

$$\bar\eta H-H\bar\eta = \hbar\omega\bar\eta. \tag{8}$$

Also, (7) leads to

$$\bar\eta\eta^n-\eta^n\bar\eta = n\eta^{n-1} \tag{9}$$

for any positive integer $n$, as may be verified by induction, since, by
multiplying (9) by $\eta$ on the left, we can deduce (9) with $n+1$ for $n$.
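The induction step can be written out as follows. This is a sketch of the intermediate algebra only, starting from (9) in the form $\bar\eta\eta^n-\eta^n\bar\eta = n\eta^{n-1}$ and using (7) in the form $\eta\bar\eta = \bar\eta\eta-1$; the case $n = 1$ is (7) itself.

```latex
\begin{aligned}
\eta\,(\bar\eta\eta^{n} - \eta^{n}\bar\eta) &= n\eta^{n}, \\
\eta\bar\eta\,\eta^{n} &= (\bar\eta\eta - 1)\,\eta^{n} = \bar\eta\eta^{n+1} - \eta^{n}, \\
\bar\eta\eta^{n+1} - \eta^{n+1}\bar\eta &= n\eta^{n} + \eta^{n} = (n+1)\,\eta^{n}.
\end{aligned}
```

The last line is (9) with $n+1$ written for $n$.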

Let $H'$ be an eigenvalue of $H$ and $|H'\rangle$ an eigenket belonging to it.
From (5)

$$\hbar\omega\langle H'|\eta\bar\eta|H'\rangle = \langle H'|H-\tfrac12\hbar\omega|H'\rangle = (H'-\tfrac12\hbar\omega)\langle H'|H'\rangle.$$

Now $\langle H'|\eta\bar\eta|H'\rangle$ is the square of the length of the ket $\bar\eta|H'\rangle$, and
hence

$$\langle H'|\eta\bar\eta|H'\rangle \geq 0,$$

the case of equality occurring only if $\bar\eta|H'\rangle = 0$. Also $\langle H'|H'\rangle > 0$.
Thus

$$H' \geq \tfrac12\hbar\omega, \tag{10}$$

the case of equality occurring only if $\bar\eta|H'\rangle = 0$. From the form (1)
of $H$ as a sum of squares, we should expect its eigenvalues to be all
positive or zero (since the average value of $H$ for any state must be
positive or zero). We now have the more stringent condition (10).

From (8)

$$H\bar\eta|H'\rangle = (\bar\eta H-\hbar\omega\bar\eta)|H'\rangle = (H'-\hbar\omega)\bar\eta|H'\rangle. \tag{11}$$

Now if $H' \neq \tfrac12\hbar\omega$, $\bar\eta|H'\rangle$ is not zero and is then according to (11) an
eigenket of $H$ belonging to the eigenvalue $H'-\hbar\omega$. Thus, with $H'$
any eigenvalue of $H$ not equal to $\tfrac12\hbar\omega$, $H'-\hbar\omega$ is another eigenvalue
of $H$. We can repeat the argument and infer that, if $H'-\hbar\omega \neq \tfrac12\hbar\omega$,
$H'-2\hbar\omega$ is another eigenvalue of $H$. Continuing in this way, we
obtain the series of eigenvalues $H'$, $H'-\hbar\omega$, $H'-2\hbar\omega$, $H'-3\hbar\omega$, ...,
which cannot extend to infinity, because then it would contain
eigenvalues contradicting (10), and can terminate only with the value $\tfrac12\hbar\omega$.
Again, from the conjugate complex of equation (8)

$$H\eta|H'\rangle = (\eta H+\hbar\omega\eta)|H'\rangle = (H'+\hbar\omega)\eta|H'\rangle,$$

showing that $H'+\hbar\omega$ is another eigenvalue of $H$, with $\eta|H'\rangle$ as an
eigenket belonging to it, unless $\eta|H'\rangle = 0$. The latter alternative
can be ruled out, since it would lead to

$$0 = \hbar\omega\bar\eta\eta|H'\rangle = (H+\tfrac12\hbar\omega)|H'\rangle = (H'+\tfrac12\hbar\omega)|H'\rangle,$$

which contradicts (10). Thus $H'+\hbar\omega$ is always another eigenvalue
of $H$, and so are $H'+2\hbar\omega$, $H'+3\hbar\omega$ and so on. Hence the eigenvalues
of $H$ are the series of numbers

$$\tfrac12\hbar\omega,\quad \tfrac32\hbar\omega,\quad \tfrac52\hbar\omega,\quad \tfrac72\hbar\omega,\quad \ldots \tag{12}$$

extending to infinity. These are the possible energy values for the
harmonic oscillator.
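The algebra of $\eta$ and $\bar\eta$ can be illustrated with finite matrices. The sketch below is an illustration under assumptions, not the method of the text: it uses a basis of normalized stationary states truncated at a hypothetical size `N = 6`, in which $\eta$ raises the quantum number by one. Truncation spoils (7) in the last diagonal element only, so that element is excluded from the comparison.

```python
import math

N = 6  # size of the truncated basis (illustrative choice)

def zeros(n):
    return [[0.0] * n for _ in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# eta raises the quantum number: eta|n> = sqrt(n+1)|n+1> on normalized states.
eta = zeros(N)
for n in range(N - 1):
    eta[n + 1][n] = math.sqrt(n + 1)
# eta_bar is the conjugate transpose (all entries here are real).
eta_bar = [[eta[j][i] for j in range(N)] for i in range(N)]

etabar_eta = matmul(eta_bar, eta)
eta_etabar = matmul(eta, eta_bar)
commutator = [[etabar_eta[i][j] - eta_etabar[i][j] for j in range(N)]
              for i in range(N)]

# Energies in units of hbar*omega, from (5): H = hbar*omega*(eta eta_bar + 1/2).
energies = [eta_etabar[i][i] + 0.5 for i in range(N)]
print(energies)
```

The printed energies reproduce the first `N` members of the series (12) in units of $\hbar\omega$, and the commutator is the unit matrix apart from the truncation artefact in its last diagonal entry.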
Let $|0\rangle$ be an eigenket of $H$ belonging to the lowest eigenvalue
$\tfrac12\hbar\omega$, so that

$$\bar\eta|0\rangle = 0, \tag{13}$$

and form the sequence of kets

$$|0\rangle,\quad \eta|0\rangle,\quad \eta^2|0\rangle,\quad \eta^3|0\rangle,\quad \ldots. \tag{14}$$

These kets are all eigenkets of $H$, belonging to the sequence of
eigenvalues (12) respectively. From (9) and (13)

$$\bar\eta\eta^n|0\rangle = n\eta^{n-1}|0\rangle \tag{15}$$

for any non-negative integer $n$. Thus the set of kets (14) is such that
$\eta$ or $\bar\eta$ applied to any one of the set gives a ket dependent on the set.
Now all the dynamical variables in our problem are expressible in terms
of $\eta$ and $\bar\eta$, so the kets (14) must form a complete set (otherwise there
would be some more dynamical variables). There is just one of these
kets for each eigenvalue (12) of $H$, so $H$ by itself forms a complete
commuting set of observables. The kets (14) correspond to the various
stationary states of the oscillator. The stationary state with energy
$(n+\tfrac12)\hbar\omega$, corresponding to $\eta^n|0\rangle$, is called the $n$th quantum state.

The square of the length of the ket $\eta^n|0\rangle$ is

$$\langle0|\bar\eta^n\eta^n|0\rangle = n\langle0|\bar\eta^{n-1}\eta^{n-1}|0\rangle$$

with the help of (15). By induction, we find that

$$\langle0|\bar\eta^n\eta^n|0\rangle = n! \tag{16}$$

provided $|0\rangle$ is normalized. Thus the kets (14) multiplied by the
coefficients $n!^{-1/2}$ with $n = 0, 1, 2, \ldots$, respectively form the basic kets
of a representation, namely the representation with $H$ diagonal. Any
ket $|x\rangle$ can be expanded in the form

$$|x\rangle = \sum_{n=0}^{\infty} x_n\eta^n|0\rangle, \tag{17}$$

where the $x_n$'s are numbers. In this way the ket $|x\rangle$ is put into
correspondence with a power series $\sum x_n\eta^n$ in the variable $\eta$, the
various terms in the power series corresponding to the various
stationary states. If $|x\rangle$ is normalized, it defines a state for which
the probability of the oscillator being in the $n$th quantum state,
i.e. the probability of $H$ having the value $(n+\tfrac12)\hbar\omega$, is

$$P_n = n!\,|x_n|^2, \tag{18}$$

as follows from the same argument which led to (51) of § 18.
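Formula (18) is easy to evaluate for a given set of coefficients. The sketch below is illustrative: the function name and the choice $x_n = 1/n!$ for $n = 0, \ldots, 9$ are hypothetical. The coefficients are first normalized using (16) together with the orthogonality of the kets (14), which belong to different eigenvalues of $H$.

```python
import math

def level_probabilities(coeffs):
    """Given coefficients x_n of eta^n|0>, return P_n = n! |x_n|^2 after normalizing."""
    raw = [math.factorial(n) * abs(x) ** 2 for n, x in enumerate(coeffs)]
    norm = sum(raw)  # squared length of the unnormalized ket, from (16)
    return [r / norm for r in raw]

# Hypothetical example: x_n = 1/n! for n = 0..9 (a truncated exponential series).
coeffs = [1.0 / math.factorial(n) for n in range(10)]
probs = level_probabilities(coeffs)
print(probs[:4])
```

For this choice the unnormalized weights are $1/n!$, so the normal state and the first quantum state are equally probable and the probabilities fall off rapidly for larger $n$.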

We may consider the ket $|0\rangle$ as a standard ket and the power series
in $\eta$ as a wave function, since any ket can be expressed as such a
wave function multiplied into this standard ket. We get a kind of
wave function differing from the usual kind, introduced by equations
(62) of § 20, in that it is a function of the complex dynamical variable
$\eta$ instead of observables. It was first introduced by V. Fock, so we
shall call the representation Fock's representation. It is for many
purposes the most convenient representation for describing states of
the harmonic oscillator. The standard ket $|0\rangle$ satisfies the condition
(13), which replaces the conditions (43) of § 22 for the standard ket
in Schrödinger's representation.

Let us introduce Schrödinger's representation with $q$ diagonal and
obtain the representatives of the stationary states. From (13) and (3)

$$(p-im\omega q)|0\rangle = 0,$$

so

$$\langle q'|p-im\omega q|0\rangle = 0.$$

With the help of (45) of § 22, this gives

$$\hbar\frac{\partial}{\partial q'}\langle q'|0\rangle+m\omega q'\langle q'|0\rangle = 0. \tag{19}$$

The solution of this differential equation is

$$\langle q'|0\rangle = (m\omega/\pi\hbar)^{1/4}e^{-m\omega q'^2/2\hbar}, \tag{20}$$

the numerical coefficient being chosen so as to make $|0\rangle$ normalized.
We have here the representative of the normal state, as the state of
lowest energy is called. The representatives of the other stationary
states can be obtained from it. We have from (3)

$$\begin{aligned}
\langle q'|\eta^n|0\rangle &= (2m\hbar\omega)^{-n/2}\langle q'|(p+im\omega q)^n|0\rangle \\
&= (2m\hbar\omega)^{-n/2}\,i^n\Bigl(-\hbar\frac{\partial}{\partial q'}+m\omega q'\Bigr)^n\langle q'|0\rangle \\
&= i^n(2m\hbar\omega)^{-n/2}(m\omega/\pi\hbar)^{1/4}\Bigl(-\hbar\frac{\partial}{\partial q'}+m\omega q'\Bigr)^n e^{-m\omega q'^2/2\hbar}.
\end{aligned} \tag{21}$$

This may easily be worked out for small values of $n$. The result is of
the form of $e^{-m\omega q'^2/2\hbar}$ times a power series of degree $n$ in $q'$. A further
factor $n!^{-1/2}$ must be inserted in (21) to get the normalized
representative of the $n$th quantum state. The phase factor $i^n$ may be discarded.
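The working out of (21) for small $n$ can be mechanized. The sketch below is illustrative and makes two assumptions not in the text: units with $\hbar = m = \omega = 1$, and a polynomial stored as a list of coefficients (`poly[k]` multiplies $q'^k$). In these units the operator $-\partial/\partial q'+q'$ acting on $P(q')e^{-q'^2/2}$ gives $(2q'P-P')e^{-q'^2/2}$, so only the polynomial factor needs to be tracked. The function names are hypothetical.

```python
def apply_raising(poly):
    """Map P(q) to 2*q*P(q) - P'(q), with poly[k] the coefficient of q**k.

    This is the action of (-d/dq + q) on P(q) * exp(-q**2 / 2) in units where
    hbar = m = omega = 1, expressed on the polynomial factor alone.
    """
    result = [0.0] * (len(poly) + 1)
    for k, c in enumerate(poly):
        result[k + 1] += 2.0 * c      # contribution of 2*q*P
        if k >= 1:
            result[k - 1] -= k * c    # contribution of -P'
    return result

def polynomial_factor(n):
    """Polynomial multiplying exp(-q**2 / 2) after n applications to the normal state."""
    poly = [1.0]
    for _ in range(n):
        poly = apply_raising(poly)
    return poly

for n in range(4):
    print(n, polynomial_factor(n))
```

The polynomial has degree $n$, as stated in the text; with these units the recurrence is the one satisfied by the Hermite polynomials in the usual physicists' convention. The constant factors and $n!^{-1/2}$ of (21) are omitted here.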
35. Angular momentum

Let us consider a particle described by the three Cartesian
coordinates $x$, $y$, $z$ and their conjugate momenta $p_x$, $p_y$, $p_z$. Its angular
momentum about the origin is defined as in the classical theory, by

$$m_x = yp_z-zp_y, \qquad m_y = zp_x-xp_z, \qquad m_z = xp_y-yp_x, \tag{22}$$

or by the vector equation

$$\mathbf{m} = \mathbf{x}\times\mathbf{p}.$$

We must evaluate the P.B.s of the angular momentum components
with the dynamical variables $x$, $p_x$, etc., and with each other. This
we can do most conveniently with the help of the laws (4) and (5) of
§ 21, thus

$$\begin{aligned}
{}[m_z,x] &= [xp_y-yp_x,x] = -y[p_x,x] = y, \\
{}[m_z,y] &= [xp_y-yp_x,y] = x[p_y,y] = -x,
\end{aligned} \tag{23}$$

$$[m_z,z] = [xp_y-yp_x,z] = 0, \tag{24}$$

and similarly,

$$[m_z,p_x] = p_y, \qquad [m_z,p_y] = -p_x, \tag{25}$$

$$[m_z,p_z] = 0, \tag{26}$$

with corresponding relations for $m_x$ and $m_y$. Again

$$\begin{aligned}
{}[m_y,m_z] &= [zp_x-xp_z,m_z] = z[p_x,m_z]-[x,m_z]p_z \\
&= -zp_y+yp_z = m_x, \\
{}[m_z,m_x] &= m_y, \qquad [m_x,m_y] = m_z.
\end{aligned} \tag{27}$$

These results are all the same as in the classical theory. The sign in
the results (23), (25), and (27) may easily be remembered from the
rule that the + sign occurs when the three dynamical variables,
consisting of the two in the P.B. on the left-hand side and the one
forming the result on the right, are in the cyclic order ($xyz$) and the
− sign occurs otherwise. Equations (27) may be put in the vector
form

$$\mathbf{m}\times\mathbf{m} = i\hbar\mathbf{m}. \tag{28}$$
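Since the results (23) to (27) are stated to be the same as in the classical theory, they can be checked with classical Poisson brackets. The sketch below is an illustration only: it evaluates the classical P.B. by central differences at an arbitrarily chosen phase-space point, and all function names and the numerical point are hypothetical. Central differences are exact, up to rounding, for the quadratic functions involved.

```python
def poisson_bracket(u, v, state, h=1e-3):
    """Classical P.B. [u, v] = sum_r (du/dq_r dv/dp_r - du/dp_r dv/dq_r).

    state = (x, y, z, px, py, pz); derivatives use central differences.
    """
    def partial(f, i):
        up = list(state)
        dn = list(state)
        up[i] += h
        dn[i] -= h
        return (f(up) - f(dn)) / (2.0 * h)

    return sum(partial(u, r) * partial(v, r + 3) - partial(u, r + 3) * partial(v, r)
               for r in range(3))

def m_x(s):
    return s[1] * s[5] - s[2] * s[4]  # y*pz - z*py

def m_y(s):
    return s[2] * s[3] - s[0] * s[5]  # z*px - x*pz

def m_z(s):
    return s[0] * s[4] - s[1] * s[3]  # x*py - y*px

def x(s):
    return s[0]

def y(s):
    return s[1]

state = (0.3, -1.2, 0.7, 1.1, 0.4, -0.9)  # arbitrary phase-space point
print(poisson_bracket(m_z, x, state), y(state))      # [m_z, x] against y
print(poisson_bracket(m_x, m_y, state), m_z(state))  # [m_x, m_y] against m_z
```

Each printed pair agrees to rounding error, matching the first of (23) and the last of (27).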

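Now suppose we have several particles with angular momenta
$\mathbf{m}_1, \mathbf{m}_2, \ldots$. Each of these angular momentum vectors will satisfy
(28), thus

$$\mathbf{m}_r\times\mathbf{m}_r = i\hbar\mathbf{m}_r,$$

and any one of them will commute with any other, so that

$$\mathbf{m}_r\times\mathbf{m}_s+\mathbf{m}_s\times\mathbf{m}_r = 0 \qquad (r \neq s).$$

Hence if $\mathbf{M} = \sum_r\mathbf{m}_r$ is the total angular momentum,

$$\begin{aligned}
\mathbf{M}\times\mathbf{M} &= \sum_{rs}\mathbf{m}_r\times\mathbf{m}_s = \sum_r\mathbf{m}_r\times\mathbf{m}_r+\sum_{r<s}(\mathbf{m}_r\times\mathbf{m}_s+\mathbf{m}_s\times\mathbf{m}_r) \\
&= i\hbar\sum_r\mathbf{m}_r = i\hbar\mathbf{M}.
\end{aligned} \tag{29}$$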

This result is of the same form as (28), so that the components of the 
total angular momentum M of any number of particles satisfy the 
same commutation relations as those of the angular momentum of 
a single particle. 
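The result (29) can be checked with explicit matrices for two subsystems. The sketch below is an illustration under assumptions: each single-system angular momentum is represented by $\hbar/2$ times the Pauli matrices, used here only as a convenient set of matrices satisfying (28), with $\hbar = 1$; the helper names are hypothetical. Operators of different subsystems are embedded in the joint space by Kronecker products, so that they commute.

```python
HBAR = 1.0  # units with hbar = 1 (an assumption made for the illustration)

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(len(a))] for i in range(len(a))]

def scale(c, a):
    return [[c * v for v in row] for row in a]

def kron(a, b):
    """Kronecker product of two square matrices."""
    na, nb = len(a), len(b)
    return [[a[i // nb][j // nb] * b[i % nb][j % nb] for j in range(na * nb)]
            for i in range(na * nb)]

def commutator(a, b):
    return add(matmul(a, b), scale(-1, matmul(b, a)))

I2 = [[1, 0], [0, 1]]
# Single-system components: hbar/2 times the Pauli matrices.
mx = scale(HBAR / 2, [[0, 1], [1, 0]])
my = scale(HBAR / 2, [[0, -1j], [1j, 0]])
mz = scale(HBAR / 2, [[1, 0], [0, -1]])

def total(m):
    """Total component M = m_1 + m_2 acting on the two-system space."""
    return add(kron(m, I2), kron(I2, m))

Mx, My, Mz = total(mx), total(my), total(mz)
lhs = commutator(Mx, My)     # z-component of M x M
rhs = scale(1j * HBAR, Mz)   # i*hbar*M_z
max_diff = max(abs(lhs[i][j] - rhs[i][j]) for i in range(4) for j in range(4))
print(max_diff)
```

The printed difference is zero, confirming the $z$-component of (29) for this example; the other components follow by cyclic permutation.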

Let $A_x$, $A_y$, $A_z$ denote the three coordinates of any one of the
particles, or else the three components of momentum of one of
the particles. The $A$'s will commute with the angular momenta of
the other particles, and hence from (23), (24), (25), and (26)

$$[M_z,A_x] = A_y, \qquad [M_z,A_y] = -A_x, \qquad [M_z,A_z] = 0. \tag{30}$$

If $B_x$, $B_y$, $B_z$ are a second set of three quantities denoting the
coordinates or momentum components of one of the particles, they
will satisfy similar relations to (30). We shall then have

$$\begin{aligned}
{}[M_z,\,A_xB_x+A_yB_y+A_zB_z]
&= [M_z,A_x]B_x+A_x[M_z,B_x]+[M_z,A_y]B_y+A_y[M_z,B_y] \\
&= A_yB_x+A_xB_y-A_xB_y-A_yB_x \\
&= 0.
\end{aligned}$$

Thus the scalar product $A_xB_x+A_yB_y+A_zB_z$ commutes with $M_z$,
and similarly with $M_x$ and $M_y$. Introduce the vector product

$$\mathbf{A}\times\mathbf{B} = \mathbf{C}$$

or

$$A_yB_z-A_zB_y = C_x, \qquad A_zB_x-A_xB_z = C_y, \qquad A_xB_y-A_yB_x = C_z.$$

We have

$$[M_z,C_x] = -A_xB_z+A_zB_x = C_y$$

and similarly

$$[M_z,C_y] = -C_x, \qquad [M_z,C_z] = 0.$$

These equations are again of the form (30), with $\mathbf{C}$ for $\mathbf{A}$. We can
conclude from this work that equations of the form (30) hold for the
three components of any vector that we can construct from our
dynamical variables, and that any scalar commutes with $\mathbf{M}$.

We can introduce linear operators $R$ referring to rotations about
the origin in the same way in which we introduced the linear operators
$D$ in § 25 referring to displacements. Taking a rotation through an
angle $\delta\phi$ about the $z$-axis and making $\delta\phi$ infinitesimal, we can obtain
the limit operator corresponding to (64) of § 25,

$$\lim_{\delta\phi\to0}(R-1)/\delta\phi,$$

which we shall call the rotation operator about the $z$-axis and denote
by $r_z$. Like the displacement operators, $r_z$ is a pure imaginary linear
operator and is undetermined to the extent of an arbitrary additive
pure imaginary number. Corresponding to (66) of § 25, the change
in any dynamical variable $v$ caused by a rotation through a small
angle $\delta\phi$ about the $z$-axis is

$$\delta\phi\,(r_zv-vr_z), \tag{31}$$

to the first order in $\delta\phi$. Now the changes produced in the three
components $A_x$, $A_y$, $A_z$ of a vector by a (right-handed) rotation $\delta\phi$
about the $z$-axis applied to all measuring apparatus are $\delta\phi A_y$,
$-\delta\phi A_x$, and 0 respectively, and any scalar quantity is unchanged by
the rotation. Equating these changes to (31), we find that

$$r_zA_x-A_xr_z = A_y, \qquad r_zA_y-A_yr_z = -A_x, \qquad r_zA_z-A_zr_z = 0,$$

and $r_z$ commutes with any scalar. Comparing these results with (30),
we see that $i\hbar r_z$ satisfies the same commutation relations as $M_z$.
Their difference, $M_z-i\hbar r_z$, commutes with all the dynamical variables
and must therefore be a number. This number, which is necessarily
real since $M_z$ and $ir_z$ are real, may be made zero by a suitable choice
of the arbitrary pure imaginary number that can be added to $r_z$. We
then have the result

$$M_z = i\hbar r_z. \tag{32}$$

Similar equations hold for $M_x$ and $M_y$. They are the analogues of (69)
of § 25. Thus the total angular momentum is connected with the rotation
operators as the total momentum is connected with the displacement
operators. This conclusion is valid for any point as origin.

The above argument applies to the angular momentum arising 
from the motion of particles, defined by (22) for each particle. There 
is another kind of angular momentum occurring in atomic theory, 
spin angular momentum. The former kind of angular momentum will 
be called orbital angular momentum, to distinguish it. The spin angu- 
lar momentum of a particle should be pictured as due to some internal 
motion of the particle, so that it is associated with different degrees 
of freedom from those describing the motion of the particle as a whole,
and hence the dynamical variables that describe the spin must commute
with $x$, $y$, $z$, $p_x$, $p_y$, and $p_z$. The spin does not correspond very
closely to anything in classical mechanics, so the method of classical
analogy is not suitable for studying it. However, we can build up a
theory of the spin simply from the assumption that the components
of the spin angular momentum are connected with the rotation operators
in the same way as we had above for orbital angular momentum,
i.e. equation (32) holds with $M_z$ as the $z$ component of the spin angular
momentum of a particle and $r_z$ as the rotation operator about the
$z$-axis referring to states of spin of that particle. With this assumption,
the commutation relations connecting the components of the
spin angular momentum $\mathbf{M}$ with any vector $\mathbf{A}$ referring to the spin
must be of the standard form (30), and hence, taking $\mathbf{A}$ to be the
spin angular momentum itself, we have equation (29) holding also
for the spin. We now have (29) holding quite generally, for any sum
of spin and orbital angular momenta, and also (30) will hold generally,
for $\mathbf{M}$ the total spin and orbital angular momentum and $\mathbf{A}$ any vector
dynamical variable, and the connexion between angular momentum
and rotation operators will be always valid.

As an immediate consequence of this connexion, we can deduce the 
law of conservation of angular momentum. For an isolated system, the 
Hamiltonian must be unchanged by any rotation about the origin, in 
other words it must be a scalar, so it must commute with the angular 
momentum about the origin. Thus the angular momentum is a 
constant of the motion. For this argument the origin may be any 
point. 

As a second immediate consequence, we can deduce that a state
with zero total angular momentum is spherically symmetrical. The state
will correspond to a ket $|S\rangle$, say, satisfying
$$M_x|S\rangle = M_y|S\rangle = M_z|S\rangle = 0, \tag{33}$$
and hence
$$r_x|S\rangle = r_y|S\rangle = r_z|S\rangle = 0.$$
This shows that the ket $|S\rangle$ is unaltered by infinitesimal rotations,
and it must therefore be unaltered by finite rotations, since the latter
can be built up from infinitesimal ones. Thus the state is spherically
symmetrical. The result may be understood in this way: if a state has
zero total angular momentum, the dynamical system is equally likely
to have any orientation, and hence spherical symmetry occurs. It is
analogous to stating that if a state has zero total linear momentum,
the system is equally likely to be anywhere in space.

The converse result is also true, a spherically symmetrical state has 
zero total angular momentum. This is obvious physically, since angular 
momentum is of the nature of a vector and, if it is not zero, its existence 
must destroy the spherical symmetry. 

It should be noted that in (33) we have a ket $|S\rangle$ that is a
simultaneous eigenket for non-commuting observables. This is usually not
possible, but it is possible in the present special case, because the three
equations (33) together with the commutation relations (29) do not
lead to any inconsistency.


36. Properties of angular momentum 

There are some general properties of angular momentum, deducible
simply from the commutation relations between the three components.
These properties must hold equally for spin and orbital angular
momentum. Let $m_x$, $m_y$, $m_z$ be the three components of an angular
momentum, and introduce the quantity $\beta$ defined by
$$\beta = m_x^2 + m_y^2 + m_z^2.$$
Since $\beta$ is a scalar it must commute with $m_x$, $m_y$, and $m_z$. Let us
suppose we have a dynamical system for which $m_x$, $m_y$, $m_z$ are the
only dynamical variables. Then $\beta$ commutes with everything and
must be a number. We can study this dynamical system on much
the same lines as we used for the harmonic oscillator in § 34.

Put
$$m_x - i m_y = \eta.$$
From the commutation relations (27) we get
$$\bar\eta\eta = (m_x + i m_y)(m_x - i m_y) = m_x^2 + m_y^2 - i(m_x m_y - m_y m_x) = \beta - m_z^2 + \hbar m_z \tag{34}$$
and similarly
$$\eta\bar\eta = \beta - m_z^2 - \hbar m_z. \tag{35}$$
Thus
$$\bar\eta\eta - \eta\bar\eta = 2\hbar m_z. \tag{36}$$
Also
$$m_z\eta - \eta m_z = i\hbar m_y - \hbar m_x = -\hbar\eta. \tag{37}$$


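We assume that the components of an angular momentum are
observables and thus $m_z$ has eigenvalues. Let $m_z'$ be one of them,
and $|m_z'\rangle$ an eigenket belonging to it. From (34)
$$\langle m_z'|\bar\eta\eta|m_z'\rangle = \langle m_z'|\beta - m_z^2 + \hbar m_z|m_z'\rangle = (\beta - m_z'^2 + \hbar m_z')\langle m_z'|m_z'\rangle.$$
The left-hand side here is the square of the length of the ket $\eta|m_z'\rangle$
and is thus greater than or equal to zero, the case of equality occurring
if and only if $\eta|m_z'\rangle = 0$. Hence
$$\beta - m_z'^2 + \hbar m_z' \ge 0,$$
or
$$\beta + \tfrac14\hbar^2 \ge (m_z' - \tfrac12\hbar)^2. \tag{38}$$
Thus
$$\beta + \tfrac14\hbar^2 \ge 0.$$
Defining the number $k$ by
$$k + \tfrac12\hbar = (\beta + \tfrac14\hbar^2)^{1/2} = (m_x^2 + m_y^2 + m_z^2 + \tfrac14\hbar^2)^{1/2}, \tag{39}$$
so that $k \ge -\tfrac12\hbar$, the inequality (38) becomes
$$k + \tfrac12\hbar \ge |m_z' - \tfrac12\hbar|$$
or
$$k + \hbar \ge m_z' \ge -k. \tag{40}$$
An equality occurs if and only if $\eta|m_z'\rangle = 0$. Similarly from (35)
$$\langle m_z'|\eta\bar\eta|m_z'\rangle = (\beta - m_z'^2 - \hbar m_z')\langle m_z'|m_z'\rangle,$$
showing that
$$\beta - m_z'^2 - \hbar m_z' \ge 0$$
or
$$k \ge m_z' \ge -k - \hbar,$$
with an equality occurring if and only if $\bar\eta|m_z'\rangle = 0$. This result
combined with (40) shows that $k \ge 0$ and
$$k \ge m_z' \ge -k, \tag{41}$$
with $m_z' = k$ if $\bar\eta|m_z'\rangle = 0$ and $m_z' = -k$ if $\eta|m_z'\rangle = 0$.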
From (37)
$$m_z\eta|m_z'\rangle = (\eta m_z - \hbar\eta)|m_z'\rangle = (m_z' - \hbar)\eta|m_z'\rangle.$$
Now if $m_z' \ne -k$, $\eta|m_z'\rangle$ is not zero and is then an eigenket of $m_z$
belonging to the eigenvalue $m_z' - \hbar$. Similarly, if $m_z' - \hbar \ne -k$,
$m_z' - 2\hbar$ is another eigenvalue of $m_z$, and so on. We get in this way a series
of eigenvalues $m_z'$, $m_z' - \hbar$, $m_z' - 2\hbar$, ..., which must terminate from (41),
and can terminate only with the value $-k$. Again, from the conjugate
complex of equation (37)
$$m_z\bar\eta|m_z'\rangle = (\bar\eta m_z + \hbar\bar\eta)|m_z'\rangle = (m_z' + \hbar)\bar\eta|m_z'\rangle,$$
showing that $m_z' + \hbar$ is another eigenvalue of $m_z$ unless $\bar\eta|m_z'\rangle = 0$, in
which case $m_z' = k$. Continuing in this way we get a series of
eigenvalues $m_z'$, $m_z' + \hbar$, $m_z' + 2\hbar$, ..., which must terminate from (41), and
can terminate only with the value $k$. We can conclude that $2k$ is an
integral multiple of $\hbar$ and that the eigenvalues of $m_z$ are
$$k,\ k - \hbar,\ k - 2\hbar,\ \ldots,\ -k + \hbar,\ -k. \tag{42}$$
The eigenvalues of $m_x$ and $m_y$ are the same, from symmetry. These
eigenvalues are all integral or half odd integral multiples of $\hbar$,
according to whether $2k$ is an even or odd multiple of $\hbar$.

Let $|\max\rangle$ be an eigenket of $m_z$ belonging to the maximum
eigenvalue $k$, so that
$$\bar\eta|\max\rangle = 0, \tag{43}$$
and form the sequence of kets
$$|\max\rangle,\ \eta|\max\rangle,\ \eta^2|\max\rangle,\ \ldots,\ \eta^{2k/\hbar}|\max\rangle. \tag{44}$$
These kets are all eigenkets of $m_z$, belonging to the sequence of
eigenvalues (42) respectively. The set of kets (44) is such that the operator
$\eta$ applied to any one of them gives a ket dependent on the set ($\eta$
applied to the last gives zero), and from (36) and (43) one sees
that $\bar\eta$ applied to any one of the set also gives a ket dependent on the
set. All the dynamical variables for the system we are now dealing
with are expressible in terms of $\eta$ and $\bar\eta$, so the set of kets (44) is a
complete set. There is just one of these kets for each eigenvalue (42)
of $m_z$, so $m_z$ by itself forms a complete commuting set of observables.

It is convenient to define the magnitude of the angular momentum
vector $\mathbf{m}$ to be $k$, given by (39), rather than $\beta^{1/2}$, because the possible
values for $k$ are
$$0,\ \tfrac12\hbar,\ \hbar,\ \tfrac32\hbar,\ 2\hbar,\ \ldots, \tag{45}$$
extending to infinity, while the possible values for $\beta^{1/2}$ are a more
complicated set of numbers.
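
The ladder construction can be checked numerically. The following is a
minimal sketch, assuming units in which $\hbar = 1$ and a normalized basis
ordered as in (42); the normalization of $\eta$ is read off from (34), and the
function names are hypothetical, chosen only for this illustration. It builds
matrices for $m_x$, $m_y$, $m_z$ from $\eta$ and $\bar\eta$ and measures how far
they are from satisfying $m_x m_y - m_y m_x = i\hbar m_z$ and $\beta = k(k+\hbar)$.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def angular_momentum_matrices(two_k):
    """Matrices of m_x, m_y, m_z (units of hbar) for magnitude k = two_k/2.

    Basis kets are ordered by the m_z eigenvalues k, k-1, ..., -k of (42).
    """
    k = two_k / 2
    dim = two_k + 1
    m_values = [k - i for i in range(dim)]
    # eta = m_x - i m_y lowers the m_z eigenvalue by one unit, from (37).
    # From (34) the squared length of eta|m> is k(k+1) - m(m-1).
    eta = [[0j] * dim for _ in range(dim)]
    for col in range(dim - 1):
        m = m_values[col]
        eta[col + 1][col] = math.sqrt(k * (k + 1) - m * (m - 1))
    eta_bar = [[eta[j][i].conjugate() for j in range(dim)] for i in range(dim)]
    mx = [[(eta[i][j] + eta_bar[i][j]) / 2 for j in range(dim)]
          for i in range(dim)]
    my = [[1j * (eta[i][j] - eta_bar[i][j]) / 2 for j in range(dim)]
          for i in range(dim)]
    mz = [[m_values[i] if i == j else 0 for j in range(dim)]
          for i in range(dim)]
    return mx, my, mz

def max_violation(two_k):
    """Largest deviation from m_x m_y - m_y m_x = i m_z and beta = k(k+1)."""
    mx, my, mz = angular_momentum_matrices(two_k)
    dim = two_k + 1
    k = two_k / 2
    xy, yx = matmul(mx, my), matmul(my, mx)
    xx, yy, zz = matmul(mx, mx), matmul(my, my), matmul(mz, mz)
    worst = 0.0
    for i in range(dim):
        for j in range(dim):
            worst = max(worst, abs(xy[i][j] - yx[i][j] - 1j * mz[i][j]))
            expected = k * (k + 1) if i == j else 0.0
            worst = max(worst, abs(xx[i][j] + yy[i][j] + zz[i][j] - expected))
    return worst
```

Calling `max_violation(two_k)` for a few values of `two_k` (the integer $2k/\hbar$)
should return a number at the level of floating-point rounding, for
half-odd-integral as well as integral $k$.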

For a dynamical system involving other dynamical variables besides
$m_x$, $m_y$, and $m_z$, there may be variables that do not commute with $\beta$.
Then $\beta$ is no longer a number, but a general linear operator. This
happens for any orbital angular momentum (22), as $x$, $y$, $z$, $p_x$, $p_y$, and
$p_z$ do not commute with $\beta$. We shall assume that $\beta$ is always an
observable, and $k$ can then be defined by (39) with the positive square
root function and is also an observable. We shall call $k$ so defined
the magnitude of the angular momentum vector $\mathbf{m}$ in the general
case. The above analysis by which we obtained the eigenvalues of
$m_z$ is still valid if we replace $|m_z'\rangle$ by a simultaneous eigenket $|k'm_z'\rangle$
of the commuting observables $k$ and $m_z$, and leads to the result that
the possible eigenvalues for $k$ are the numbers (45), and for each
eigenvalue $k'$ of $k$ the eigenvalues of $m_z$ are the numbers (42) with $k'$
substituted for $k$. We have here an example of a phenomenon which
we have not met with previously, namely that with two commuting
observables, the eigenvalues of one depend on what eigenvalue we
assign to the other. This phenomenon may be understood as the two
observables being not altogether independent, but partially functions
of one another. The number of independent simultaneous eigenkets
of $k$ and $m_z$ belonging to the eigenvalues $k'$ and $m_z'$ must be independent
of $m_z'$, since for each independent $|k'm_z'\rangle$ we can obtain an
independent $|k'm_z''\rangle$, for any $m_z''$ in the sequence (42), by multiplying
$|k'm_z'\rangle$ by a suitable power of $\eta$ or $\bar\eta$.

As an example let us consider a dynamical system with two angular
momenta $\mathbf{m}_1$ and $\mathbf{m}_2$, which commute with one another. If there are
no other dynamical variables, then all the dynamical variables commute
with the magnitudes $k_1$ and $k_2$ of $\mathbf{m}_1$ and $\mathbf{m}_2$, so $k_1$ and $k_2$ are
numbers. However, the magnitude $K$ of the resultant angular
momentum $\mathbf{M} = \mathbf{m}_1 + \mathbf{m}_2$ is not a number (it does not commute
with the components of $\mathbf{m}_1$ and $\mathbf{m}_2$) and it is interesting to work out
the eigenvalues of $K$. This can be done most simply by a method
of counting independent kets. There is one independent simultaneous
eigenket of $m_{1z}$ and $m_{2z}$ belonging to any eigenvalue $m_{1z}'$ having one of
the values $k_1$, $k_1 - \hbar$, $k_1 - 2\hbar$, ..., $-k_1$ and any eigenvalue $m_{2z}'$ having one
of the values $k_2$, $k_2 - \hbar$, $k_2 - 2\hbar$, ..., $-k_2$, and this ket is an eigenket
of $M_z$ belonging to the eigenvalue $M_z' = m_{1z}' + m_{2z}'$. The possible
values of $M_z'$ are thus $k_1 + k_2$, $k_1 + k_2 - \hbar$, $k_1 + k_2 - 2\hbar$, ..., $-k_1 - k_2$, and
the number of times each of them occurs is given by the following
scheme (if we assume for definiteness that $k_1 \ge k_2$),
$$\begin{array}{ccccccc}
k_1+k_2, & k_1+k_2-\hbar, & k_1+k_2-2\hbar, & \ldots, & k_1-k_2, & k_1-k_2-\hbar, & \ldots \\
1 & 2 & 3 & \ldots & 2k_2/\hbar+1 & 2k_2/\hbar+1 & \ldots \\[4pt]
\ldots, & -k_1+k_2, & -k_1+k_2-\hbar, & \ldots, & -k_1-k_2 & & \\
\ldots & 2k_2/\hbar+1 & 2k_2/\hbar & \ldots & 1 & &
\end{array} \tag{46}$$
Now each eigenvalue $K'$ of $K$ will be associated with the eigenvalues
$K'$, $K' - \hbar$, $K' - 2\hbar$, ..., $-K'$ for $M_z$, with the same number of independent
simultaneous eigenkets of $K$ and $M_z$ for each of them. The total
number of independent eigenkets of $M_z$ belonging to any eigenvalue
$M_z'$ must be the same, whether we take them to be simultaneous
eigenkets of $m_{1z}$ and $m_{2z}$ or simultaneous eigenkets of $K$ and $M_z$, i.e.
it is always given by the scheme (46). It follows that the eigenvalues
for $K$ are
$$k_1+k_2,\ k_1+k_2-\hbar,\ k_1+k_2-2\hbar,\ \ldots,\ k_1-k_2, \tag{47}$$
and that for each of these eigenvalues for $K$ and an eigenvalue for
$M_z$ going with it there is just one independent simultaneous eigenket
of $K$ and $M_z$.
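
The counting argument lends itself to a short program. The sketch below is
an illustration only, assuming units in which $\hbar = 1$; the function name
`resultant_magnitudes` is hypothetical. It tabulates how often each value
$M_z' = m_{1z}' + m_{2z}'$ occurs, as in the scheme (46), and then repeatedly
removes the ladder $K', K'-1, \ldots, -K'$ belonging to the largest remaining
value, which yields the list (47).

```python
from collections import Counter
from fractions import Fraction

def ladder(k):
    """Eigenvalues k, k-1, ..., -k of a component, as in (42)."""
    return [k - i for i in range(int(2 * k) + 1)]

def resultant_magnitudes(k1, k2):
    """Eigenvalues of K for M = m1 + m2, found by counting kets."""
    k1, k2 = Fraction(k1), Fraction(k2)
    for k in (k1, k2):
        if k < 0 or (2 * k).denominator != 1:
            raise ValueError("2k must be a non-negative integer")
    # Multiplicity of each M_z' = m_1z' + m_2z' (the scheme (46)).
    counts = Counter(a + b for a in ladder(k1) for b in ladder(k2))
    magnitudes = []
    while any(counts.values()):
        # The largest M_z' still present must be the top of some K' ladder.
        big_k = max(m for m, c in counts.items() if c > 0)
        magnitudes.append(big_k)
        for m in ladder(big_k):
            counts[m] -= 1
    return magnitudes
```

For instance, `resultant_magnitudes(Fraction(3, 2), 1)` gives the magnitudes
$\tfrac52$, $\tfrac32$, $\tfrac12$, in agreement with (47).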

The effect of rotations on eigenkets of angular momentum variables
should be noted. Take any eigenket $|M_z'\rangle$ of the $z$ component of total
angular momentum for any dynamical system, and apply to it a small
rotation through an angle $\delta\phi$ about the $z$-axis. It will change into
$$(1 + \delta\phi\, r_z)|M_z'\rangle = (1 - i\,\delta\phi\, M_z/\hbar)|M_z'\rangle$$
with the help of (32). This equals
$$(1 - i\,\delta\phi\, M_z'/\hbar)|M_z'\rangle = e^{-i\,\delta\phi\, M_z'/\hbar}|M_z'\rangle$$
to the first order in $\delta\phi$. Thus $|M_z'\rangle$ gets multiplied by the numerical
factor $e^{-i\,\delta\phi\, M_z'/\hbar}$. By applying a succession of these small rotations, we
find that the application of a finite rotation through an angle $\phi$ about
the $z$-axis causes $|M_z'\rangle$ to get multiplied by $e^{-i\phi M_z'/\hbar}$. Putting $\phi = 2\pi$,
we find that an application of one revolution about the $z$-axis leaves
$|M_z'\rangle$ unchanged if the eigenvalue $M_z'$ is an integral multiple of $\hbar$ and
causes $|M_z'\rangle$ to change sign if $M_z'$ is half an odd integral multiple of $\hbar$.
Now consider an eigenket $|K'\rangle$ of the magnitude $K$ of the total angular
momentum. If the eigenvalue $K'$ is an integral multiple of $\hbar$, the
possible eigenvalues of $M_z$ are all integral multiples of $\hbar$ and the application
of one revolution about the $z$-axis must leave $|K'\rangle$ unchanged.
Conversely, if $K'$ is half an odd integral multiple of $\hbar$, the possible
eigenvalues of $M_z$ are all half odd integral multiples of $\hbar$ and the revolution
must change the sign of $|K'\rangle$. From symmetry, the application of a
revolution about any other axis must have the same effect on $|K'\rangle$
as one about the $z$-axis. We thus get the general result, the application
of one revolution about any axis leaves a ket unchanged or changes its
sign according to whether it belongs to eigenvalues of the magnitude of
the total angular momentum which are integral or half odd integral
multiples of $\hbar$. A state, of course, is always unaffected by the revolution,
since a state is unaffected by a change of sign of the ket
corresponding to it.
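
The sign rule for one revolution can be illustrated by evaluating the factor
$e^{-i\phi M_z'/\hbar}$ at $\phi = 2\pi$. This is a small sketch, assuming
$M_z'$ is supplied in units of $\hbar$; the function name is hypothetical.

```python
import cmath
import math
from fractions import Fraction

def revolution_factor(m_over_hbar, phi=2 * math.pi):
    """Numerical factor exp(-i*phi*M_z'/hbar) acquired by the ket |M_z'>."""
    return cmath.exp(-1j * phi * float(m_over_hbar))

# Integral multiples of hbar give +1; half odd integral multiples give -1.
integral = [revolution_factor(m) for m in (0, 1, 2, -3)]
half_odd = [revolution_factor(Fraction(n, 2)) for n in (1, 3, -5)]
```

Up to rounding, every entry of `integral` is $+1$ and every entry of
`half_odd` is $-1$, while two revolutions ($\phi = 4\pi$) restore the ket in
both cases.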

For a dynamical system involving only orbital angular momenta,
a ket must be unchanged by a revolution about an axis, since we can
set up Schrödinger's representation, with the coordinates of all the
particles diagonal, and the Schrödinger representative of a ket will
get brought back to its original value by the revolution. It follows
that the eigenvalues of the magnitude of an orbital angular momentum
are always integral multiples of $\hbar$. The eigenvalues of a component
of an orbital angular momentum are also always integral multiples
of $\hbar$. For a spin angular momentum, Schrödinger's representation
does not exist and both kinds of eigenvalue are possible.


37. The spin of the electron 

Electrons, and also some of the other fundamental particles (protons,
neutrons) have a spin whose magnitude is $\tfrac12\hbar$. This is found
from experimental evidence, and also there are theoretical reasons
showing that this spin value is more elementary than any other, even
spin zero (see Chapter XI). The study of this particular spin is therefore
of special importance.

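For dealing with an angular momentum $\mathbf{m}$ whose magnitude is $\tfrac12\hbar$,
it is convenient to put
$$\mathbf{m} = \tfrac12\hbar\boldsymbol{\sigma}. \tag{48}$$
The components of the vector $\boldsymbol{\sigma}$ then satisfy, from (27),
$$\begin{aligned}
\sigma_y\sigma_z - \sigma_z\sigma_y &= 2i\sigma_x, \\
\sigma_z\sigma_x - \sigma_x\sigma_z &= 2i\sigma_y, \\
\sigma_x\sigma_y - \sigma_y\sigma_x &= 2i\sigma_z.
\end{aligned} \tag{49}$$
The eigenvalues of $m_z$ are $\tfrac12\hbar$ and $-\tfrac12\hbar$, so the eigenvalues of $\sigma_z$ are 1
and $-1$, and $\sigma_z^2$ has just the one eigenvalue 1. It follows that $\sigma_z^2$ must
equal 1, and similarly for $\sigma_x^2$ and $\sigma_y^2$, i.e.
$$\sigma_x^2 = \sigma_y^2 = \sigma_z^2 = 1. \tag{50}$$
We can get equations (49) and (50) into a simpler form by means of
some straightforward non-commutative algebra. From (50)
$$\sigma_y^2\sigma_z - \sigma_z\sigma_y^2 = 0$$
or
$$\sigma_y(\sigma_y\sigma_z - \sigma_z\sigma_y) + (\sigma_y\sigma_z - \sigma_z\sigma_y)\sigma_y = 0$$
or
$$\sigma_y\sigma_x + \sigma_x\sigma_y = 0$$
with the help of the first of equations (49). This means $\sigma_x\sigma_y = -\sigma_y\sigma_x$.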


Two dynamical variables or linear operators like these which satisfy
the commutative law of multiplication except for a minus sign will
be said to anticommute. Thus $\sigma_x$ anticommutes with $\sigma_y$. From symmetry
each of the three dynamical variables $\sigma_x$, $\sigma_y$, $\sigma_z$ must
anticommute with any other. Equations (49) may now be written
$$\begin{aligned}
\sigma_y\sigma_z &= i\sigma_x = -\sigma_z\sigma_y, \\
\sigma_z\sigma_x &= i\sigma_y = -\sigma_x\sigma_z, \\
\sigma_x\sigma_y &= i\sigma_z = -\sigma_y\sigma_x,
\end{aligned} \tag{51}$$
and also from (50)
$$\sigma_x\sigma_y\sigma_z = i. \tag{52}$$
Equations (50), (51), (52) are the fundamental equations satisfied by
the spin variables $\boldsymbol{\sigma}$ describing a spin whose magnitude is $\tfrac12\hbar$.

Let us set up a matrix representation for the $\sigma$'s and let us take $\sigma_z$
to be diagonal. If there are no other independent dynamical variables
besides the $m$'s or $\sigma$'s in our dynamical system, then $\sigma_z$ by itself forms
a complete set of commuting observables, since the form of equations
(50) and (51) is such that we cannot construct out of $\sigma_x$, $\sigma_y$, and $\sigma_z$
any new dynamical variable that commutes with $\sigma_z$. The diagonal
elements of the matrix representing $\sigma_z$ being the eigenvalues 1 and
$-1$ of $\sigma_z$, the matrix itself will be
$$\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$
Let $\sigma_x$ be represented by
$$\begin{pmatrix} a_1 & a_2 \\ a_3 & a_4 \end{pmatrix}.$$
This matrix must be Hermitian, so that $a_1$ and $a_4$ must be real and
$a_2$ and $a_3$ conjugate complex numbers. The equation $\sigma_z\sigma_x = -\sigma_x\sigma_z$
gives us
$$\begin{pmatrix} a_1 & a_2 \\ -a_3 & -a_4 \end{pmatrix} = \begin{pmatrix} -a_1 & a_2 \\ -a_3 & a_4 \end{pmatrix},$$
so that $a_1 = a_4 = 0$. Hence $\sigma_x$ is represented by a matrix of the form
$$\begin{pmatrix} 0 & a_2 \\ a_3 & 0 \end{pmatrix}.$$
The equation $\sigma_x^2 = 1$ now shows that $a_2 a_3 = 1$. Thus $a_2$ and $a_3$, being
conjugate complex numbers, must be of the form $e^{i\alpha}$ and $e^{-i\alpha}$
respectively, where $\alpha$ is a real number, so that $\sigma_x$ is represented by a
matrix of the form
$$\begin{pmatrix} 0 & e^{i\alpha} \\ e^{-i\alpha} & 0 \end{pmatrix}.$$
Similarly it may be shown that $\sigma_y$ is also represented by a matrix of
this form. By suitably choosing the phase factors in the representation,
which is not completely determined by the condition that $\sigma_z$
shall be diagonal, we can arrange that $\sigma_x$ shall be represented by the
matrix
$$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$
The representative of $\sigma_y$ is then determined by the equation
$\sigma_y = i\sigma_x\sigma_z$. We thus obtain finally the three matrices
$$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{53}$$
to represent $\sigma_x$, $\sigma_y$, and $\sigma_z$ respectively, which matrices satisfy all the
algebraic relations (49), (50), (51), (52). The component of the vector
$\boldsymbol{\sigma}$ in an arbitrary direction specified by the direction cosines $l$, $m$, $n$,
namely $l\sigma_x + m\sigma_y + n\sigma_z$, is represented by
$$\begin{pmatrix} n & l - im \\ l + im & -n \end{pmatrix}. \tag{54}$$
The representative of a ket vector will consist of just two numbers,
corresponding to the two values $+1$ and $-1$ for $\sigma_z'$. These two numbers
form a function of the variable $\sigma_z'$ whose domain consists of only
the two points $+1$ and $-1$. The state for which $\sigma_z$ has the value unity
will be represented by the function, $f_\alpha(\sigma_z')$ say, consisting of the pair
of numbers 1, 0 and that for which $\sigma_z$ has the value $-1$ will be
represented by the function, $f_\beta(\sigma_z')$ say, consisting of the pair 0, 1.
Any function of the variable $\sigma_z'$, i.e. any pair of numbers, can be
expressed as a linear combination of these two. Thus any state can
be obtained by superposition of the two states for which $\sigma_z$ equals $+1$ and
$-1$ respectively. For example, the state for which the component of
$\boldsymbol{\sigma}$ in the direction $l$, $m$, $n$, represented by (54), has the value $+1$ is
represented by the pair of numbers $a$, $b$ which satisfy
$$\begin{pmatrix} n & l - im \\ l + im & -n \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} a \\ b \end{pmatrix}$$
or
$$na + (l - im)b = a, \qquad (l + im)a - nb = b.$$
Thus
$$\frac{a}{b} = \frac{l - im}{1 - n} = \frac{1 + n}{l + im}.$$
This state can be regarded as a superposition of the two states for
which $\sigma_z$ equals $+1$ and $-1$, the relative weights in the superposition
process being as
$$|a|^2 : |b|^2 = |l - im|^2 : (1 - n)^2 = 1 + n : 1 - n. \tag{55}$$
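
The matrices (53) and the weights (55) are easy to verify by direct
multiplication. The following sketch is illustrative only: the helper names
are hypothetical, and the choice $a = 1+n$, $b = l+im$ is one convenient
unnormalized solution of $a/b = (1+n)/(l+im)$, which breaks down only for
$n = -1$.

```python
SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]
SIGMA_Z = [[1, 0], [0, -1]]
IDENTITY = [[1, 0], [0, 1]]

def mul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scale(c, a):
    """Multiply a 2x2 matrix by the number c."""
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def sigma_along(l, m, n):
    """Matrix (54) for the component of sigma along direction cosines l, m, n."""
    return [[n, l - 1j * m], [l + 1j * m, -n]]

def spin_up_along(l, m, n):
    """Unnormalized pair (a, b) with a/b = (1+n)/(l+im); needs n != -1."""
    return (1 + n, l + 1j * m)

def apply(matrix, pair):
    """Apply a 2x2 matrix to a pair of numbers."""
    a, b = pair
    return (matrix[0][0] * a + matrix[0][1] * b,
            matrix[1][0] * a + matrix[1][1] * b)

def weight_ratio(l, m, n):
    """Ratio |a|^2 / |b|^2, to be compared with (1+n)/(1-n) from (55)."""
    a, b = spin_up_along(l, m, n)
    return abs(a) ** 2 / abs(b) ** 2
```

With the direction cosines $l = \tfrac23$, $m = \tfrac13$, $n = \tfrac23$, for
example, (55) predicts the ratio $(1+n)/(1-n) = 5$, which `weight_ratio`
reproduces to rounding error.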
For the complete description of an electron (or other elementary
particle with spin $\tfrac12\hbar$) we require the spin dynamical variables $\boldsymbol{\sigma}$,
whose connexion with the spin angular momentum is given by (48),
together with the Cartesian coordinates $x$, $y$, $z$ and momenta $p_x$, $p_y$,
$p_z$. The spin dynamical variables commute with these coordinates
and momenta. Thus a complete set of commuting observables for a
system consisting of a single electron will be $x$, $y$, $z$, $\sigma_z$. In a
representation in which these are diagonal, the representative of any state
will be a function of four variables $x'$, $y'$, $z'$, $\sigma_z'$. Since $\sigma_z'$ has a domain
consisting of only two points, namely 1 and $-1$, this function of four
variables is the same as two functions of three variables, namely the
two functions
$$\langle x'y'z'|\rangle_+ = \langle x', y', z', +1|\rangle, \qquad \langle x'y'z'|\rangle_- = \langle x', y', z', -1|\rangle. \tag{56}$$
Thus the presence of the spin may be considered either as introducing a
new variable into the representative of a state or as giving this
representative two components.


38. Motion in a central field of force 

An atom consists of a massive positively charged nucleus together 
with a number of electrons moving round, under the influence of the 
attractive force of the nucleus and their own mutual repulsions. An 
exact treatment of this dynamical system is a very difficult mathe- 
matical problem. One can, however, gain some insight into the main 
features of the system by making the rough approximation of regard- 
ing each electron as moving independently in a certain central field 
of force, namely that of the nucleus, assumed fixed, together with 
some kind of average of the forces due to the other electrons. Thus
our present problem of the motion of a particle in a central field of 
force forms a corner-stone in the theory of the atom. 

Let the Cartesian coordinates of the particle, referred to a system
of axes with the centre of force as origin, be $x$, $y$, $z$ and the
corresponding components of momentum $p_x$, $p_y$, $p_z$. The Hamiltonian,
with neglect of relativistic mechanics, will be of the form
$$H = \frac{1}{2m}(p_x^2 + p_y^2 + p_z^2) + V, \tag{57}$$
where $V$, the potential energy, is a function only of $(x^2 + y^2 + z^2)$. To
develop the theory it is convenient to introduce polar dynamical
variables. We introduce first the radius $r$, defined as the positive
square root
$$r = (x^2 + y^2 + z^2)^{1/2}.$$
Its eigenvalues go from 0 to $\infty$. If we evaluate its P.B.s with $p_x$, $p_y$,
and $p_z$, we obtain, with the help of formula (32) of § 22,
$$[r, p_x] = \frac{\partial r}{\partial x} = \frac{x}{r}, \qquad [r, p_y] = \frac{y}{r}, \qquad [r, p_z] = \frac{z}{r},$$
the same as in the classical theory. We introduce also the dynamical
variable $p_r$ defined by
$$p_r = r^{-1}(x p_x + y p_y + z p_z). \tag{58}$$
Its P.B. with $r$ is given by
$$\begin{aligned}
r[r, p_r] = [r, r p_r] &= [r, x p_x + y p_y + z p_z] \\
&= x[r, p_x] + y[r, p_y] + z[r, p_z] \\
&= x \cdot x/r + y \cdot y/r + z \cdot z/r = r.
\end{aligned}$$
Hence
$$[r, p_r] = 1$$
or
$$r p_r - p_r r = i\hbar.$$
The commutation relation between $r$ and $p_r$ is just the one for a
canonical coordinate and momentum, namely equation (10) of § 22.
This makes $p_r$ like the momentum conjugate to the $r$ coordinate, but
it is not exactly equal to this momentum because it is not real, its
conjugate complex being
$$\begin{aligned}
\bar p_r = (p_x x + p_y y + p_z z) r^{-1} &= (x p_x + y p_y + z p_z - 3i\hbar) r^{-1} \\
&= (r p_r - 3i\hbar) r^{-1} = p_r - 2i\hbar r^{-1}.
\end{aligned} \tag{59}$$
Thus $p_r - i\hbar r^{-1}$ is real and is the true momentum conjugate to $r$.
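The angular momentum $\mathbf{m}$ of the particle about the origin is given
by (22) and its magnitude $k$ is given by (39). Since $r$ and $p_r$ are
scalars, they commute with $\mathbf{m}$, and therefore also with $k$.
We can express the Hamiltonian in terms of $r$, $p_r$, and $k$. We have,
if $\sum_{xyz}$ denotes a sum over cyclic permutations of the suffixes $x$, $y$, $z$,
$$\begin{aligned}
k(k+\hbar) &= \sum_{xyz} m_z^2 = \sum_{xyz} (x p_y - y p_x)^2 \\
&= \sum_{xyz} (x p_y x p_y + y p_x y p_x - x p_y y p_x - y p_x x p_y) \\
&= \sum_{xyz} (x^2 p_y^2 + y^2 p_x^2 - x p_x p_y y - y p_y p_x x + x^2 p_x^2 - x p_x p_x x - 2i\hbar x p_x) \\
&= (x^2 + y^2 + z^2)(p_x^2 + p_y^2 + p_z^2) - (x p_x + y p_y + z p_z)(p_x x + p_y y + p_z z + 2i\hbar) \\
&= r^2(p_x^2 + p_y^2 + p_z^2) - r p_r(\bar p_r r + 2i\hbar) \\
&= r^2(p_x^2 + p_y^2 + p_z^2) - r p_r^2 r,
\end{aligned}$$
from (59). Hence
$$H = \frac{1}{2m}\left(\frac{1}{r} p_r^2 r + \frac{k(k+\hbar)}{r^2}\right) + V. \tag{60}$$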

This form for $H$ is such that $k$ commutes not only with $H$, as is
necessary since $k$ is a constant of the motion, but also with every
dynamical variable occurring in $H$, namely $r$, $p_r$, and $V$, which is a
function of $r$. In consequence, a simple treatment becomes possible,
namely, we may consider an eigenstate of $k$ belonging to an eigenvalue
$k'$ and then we can substitute $k'$ for $k$ in (60) and get a problem
in one degree of freedom $r$.

Let us introduce Schrödinger's representation with $x$, $y$, $z$ diagonal.
Then $p_x$, $p_y$, $p_z$ are equal to the operators $-i\hbar\,\partial/\partial x$, $-i\hbar\,\partial/\partial y$, $-i\hbar\,\partial/\partial z$
respectively. A state is represented by a wave function $\psi(xyzt)$
satisfying Schrödinger's wave equation (7) of § 27, which now reads, with
$H$ given by (57),
$$i\hbar\frac{\partial\psi}{\partial t} = \left\{-\frac{\hbar^2}{2m}\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right) + V\right\}\psi. \tag{61}$$


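We may pass from the Cartesian coordinates $x$, $y$, $z$ to the polar
coordinates $r$, $\theta$, $\phi$ by means of the equations
$$\begin{aligned}
x &= r\sin\theta\cos\phi, \\
y &= r\sin\theta\sin\phi, \\
z &= r\cos\theta,
\end{aligned} \tag{62}$$
and may express the wave function in terms of the polar coordinates,
so that it reads $\psi(r\theta\phi t)$. The equations (62) give the operator equation
$$\frac{\partial}{\partial r} = \frac{\partial x}{\partial r}\frac{\partial}{\partial x} + \frac{\partial y}{\partial r}\frac{\partial}{\partial y} + \frac{\partial z}{\partial r}\frac{\partial}{\partial z} = \frac{x}{r}\frac{\partial}{\partial x} + \frac{y}{r}\frac{\partial}{\partial y} + \frac{z}{r}\frac{\partial}{\partial z},$$
which shows, on being compared with (58), that $p_r = -i\hbar\,\partial/\partial r$. Thus
Schrödinger's wave equation reads, with the form (60) for $H$,
$$i\hbar\frac{\partial\psi}{\partial t} = \left\{\frac{\hbar^2}{2m}\left(-\frac{1}{r}\frac{\partial^2}{\partial r^2}r + \frac{k(k+\hbar)}{\hbar^2 r^2}\right) + V\right\}\psi. \tag{63}$$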

Here $k$ is a certain linear operator which, since it commutes with $r$
and $\partial/\partial r$, can involve only $\theta$, $\phi$, $\partial/\partial\theta$, and $\partial/\partial\phi$. From the formula
$$k(k+\hbar) = m_x^2 + m_y^2 + m_z^2, \tag{64}$$
which comes from (39), and from (62) one can work out the form of
$k(k+\hbar)$ and one finds
$$\frac{k(k+\hbar)}{\hbar^2} = -\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\sin\theta\frac{\partial}{\partial\theta} - \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\phi^2}. \tag{65}$$
This operator is well known in mathematical physics. Its eigenfunctions
are called spherical harmonics and its eigenvalues are
$n(n+1)$ where $n$ is an integer. Thus the theory of spherical harmonics
provides an alternative proof that the eigenvalues of $k$ are
integral multiples of $\hbar$.
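
The eigenvalue statement can be spot-checked numerically. The sketch below
applies the operator (65) by central differences to two functions that are
known spherical harmonics of orders 1 and 2, namely $\cos\theta$ and
$\sin^2\theta\cos 2\phi$; these examples, the step size, the sample point, and
the function names are all choices made for this illustration.

```python
import math

def angular_operator(S, theta, phi, h=1e-3):
    """Apply -(1/sin t) d/dt (sin t d/dt) - (1/sin^2 t) d^2/dphi^2 to S."""
    flux_plus = math.sin(theta + h / 2) * (S(theta + h, phi) - S(theta, phi)) / h
    flux_minus = math.sin(theta - h / 2) * (S(theta, phi) - S(theta - h, phi)) / h
    theta_part = (flux_plus - flux_minus) / h
    phi_part = (S(theta, phi + h) - 2 * S(theta, phi) + S(theta, phi - h)) / h ** 2
    return -theta_part / math.sin(theta) - phi_part / math.sin(theta) ** 2

def eigenvalue_estimate(S, theta=0.7, phi=0.4):
    """Ratio (operator applied to S) / S at one point; n(n+1) for order n."""
    return angular_operator(S, theta, phi) / S(theta, phi)

def harmonic_order_1(theta, phi):
    return math.cos(theta)

def harmonic_order_2(theta, phi):
    return math.sin(theta) ** 2 * math.cos(2 * phi)
```

The estimates come out close to $1 \cdot 2 = 2$ and $2 \cdot 3 = 6$
respectively, up to the discretization error of the differences.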
For an eigenstate of $k$ belonging to the eigenvalue $n\hbar$ ($n$ a
non-negative integer) the wave function will be of the form
$$\psi = r^{-1}\chi(rt)S_n(\theta\phi), \tag{66}$$
where $S_n(\theta\phi)$ satisfies
$$k(k+\hbar)S_n(\theta\phi) = n(n+1)\hbar^2 S_n(\theta\phi), \tag{67}$$
i.e. from (65) $S_n$ is a spherical harmonic of order $n$. The factor $r^{-1}$
is inserted in (66) for convenience. Substituting (66) into (63), we
get as the equation for $\chi$
$$i\hbar\frac{\partial\chi}{\partial t} = \left\{\frac{\hbar^2}{2m}\left(-\frac{\partial^2}{\partial r^2} + \frac{n(n+1)}{r^2}\right) + V\right\}\chi. \tag{68}$$
If the state is a stationary state belonging to the energy value $H'$,
$\chi$ will be of the form
$$\chi(rt) = \chi_0(r)e^{-iH't/\hbar}$$
and (68) will reduce to
$$H'\chi_0 = \left\{\frac{\hbar^2}{2m}\left(-\frac{d^2}{dr^2} + \frac{n(n+1)}{r^2}\right) + V\right\}\chi_0. \tag{69}$$
This equation may be used to determine the energy-levels $H'$ of the
system. For each solution $\chi_0$ of (69), arising from a given $n$, there
will be $2n+1$ independent states, because there are $2n+1$ independent
solutions of (67) corresponding to the $2n+1$ different values
that a component of the angular momentum, $m_z$ say, can take on.
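
The step from (63) to (68) is short but worth writing out. The following is
a sketch that uses only (66), (67), and the fact that $k$ commutes with $r$
and $\partial/\partial r$:

```latex
\frac{1}{r}\frac{\partial^2}{\partial r^2}\,r\,\bigl(r^{-1}\chi S_n\bigr)
  = r^{-1} S_n\,\frac{\partial^2\chi}{\partial r^2},
\qquad
\frac{k(k+\hbar)}{\hbar^2 r^2}\,\bigl(r^{-1}\chi S_n\bigr)
  = \frac{n(n+1)}{r^2}\,r^{-1}\chi S_n .
```

Substituting both into (63) and cancelling the common factor $r^{-1}S_n$ leaves
(68). This is the convenience gained from the factor $r^{-1}$ in (66): the
radial operator reduces to a plain second derivative.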

The probability of the particle being in an element of volume
$dx\,dy\,dz$ is proportional to $|\psi|^2\,dx\,dy\,dz$. With $\psi$ of the form (66) this
becomes $r^{-2}|\chi|^2|S_n|^2\,dx\,dy\,dz$. The probability of the particle being in
a spherical shell between $r$ and $r+dr$ is then proportional to $|\chi|^2\,dr$.
It now becomes clear that, in solving equation (68) or (69), we must
impose a boundary condition on the function $\chi$ at $r = 0$, namely the
function must be such that the integral to the origin $\int_0 |\chi|^2\,dr$ is
convergent. If this integral were not convergent, the wave function
would represent a state for which the chances are infinitely in favour
of the particle being at the origin and such a state would not be
physically admissible.

The boundary condition at $r = 0$ obtained by the above consideration
of probabilities is, however, not sufficiently stringent. We get a
more stringent condition by verifying that the wave function obtained
by solving the wave equation in polar coordinates (63) really satisfies
the wave equation in Cartesian coordinates (61). Let us take the case
of $V = 0$, giving us the problem of the free particle. Applied to a
stationary state with energy $H' = 0$, equation (61) gives
$$\nabla^2\psi = 0, \tag{70}$$
where $\nabla^2$ is written for the Laplacian operator $\partial^2/\partial x^2 + \partial^2/\partial y^2 + \partial^2/\partial z^2$,
and equation (63) gives
$$\left(\frac{1}{r}\frac{\partial^2}{\partial r^2}r - \frac{k(k+\hbar)}{\hbar^2 r^2}\right)\psi = 0. \tag{71}$$
A solution of (71) for $k = 0$ is $\psi = r^{-1}$. This does not satisfy
(70), since, although $\nabla^2 r^{-1}$ vanishes for any finite value of $r$, its integral
through a volume containing the origin is $-4\pi$ (as may be verified
by transforming this volume integral to a surface integral by means
of Gauss's theorem), and hence
$$\nabla^2 r^{-1} = -4\pi\,\delta(x)\delta(y)\delta(z). \tag{72}$$
Thus not every solution of (71) gives a solution of (70), and more
generally, not every solution of (63) is a solution of (61). We must
impose on the solution of (63) the condition that it shall not tend to
infinity as rapidly as $r^{-1}$ when $r \to 0$ in order that, when substituted
into (61), it shall not give a $\delta$ function on the right like the right-hand
side of (72). Only when equation (63) is supplemented with this condition
does it become equivalent to equation (61). We thus have the
boundary condition $r\psi \to 0$ or $\chi \to 0$ as $r \to 0$.
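
The value $-4\pi$ of the volume integral can be confirmed numerically through
the surface integral that Gauss's theorem supplies. The sketch below, with a
hypothetical function name, takes the volume to be a cube centred on the
origin (any volume containing the origin would do) and sums the outward
normal component of $\nabla r^{-1} = -\mathbf{r}/r^3$ over its six faces by the
midpoint rule.

```python
import math

def flux_through_cube(half_side=1.0, n=100):
    """Outward flux of grad(1/r) through a cube centred on the origin."""
    h = half_side
    step = 2 * h / n
    one_face = 0.0
    for i in range(n):
        x = -h + (i + 0.5) * step
        for j in range(n):
            y = -h + (j + 0.5) * step
            # On the face z = +h the outward normal component of
            # grad(1/r) = -r_vec / r^3 is -h / r^3.
            one_face += -h / (x * x + y * y + h * h) ** 1.5 * step * step
    # All six faces contribute equally by symmetry.
    return 6 * one_face
```

The result is close to $-4\pi \approx -12.566$ whatever the size of the cube,
as (72) requires.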

There are also boundary conditions for the wave function at $r = \infty$.
If we are interested only in 'closed' states, i.e. states for which the
particle does not go off to infinity, we must restrict the integral to
infinity $\int^\infty |\chi(r)|^2\,dr$ to be convergent. These closed states, however,
are not the only ones that are physically permissible, as we can also
have states in which the particle arrives from infinity, is scattered
by the central field of force, and goes off to infinity again. For these
states the wave function may remain finite as $r \to \infty$. Such states will
be dealt with in Chapter VIII under the heading of collision problems.
In any case the wave function must not tend to infinity as $r \to \infty$, or
it will represent a state that has no physical meaning.


39. Energy-levels of the hydrogen atom 
The above analysis may be applied to the problem of the hydrogen
atom with neglect of relativistic mechanics and the spin of the
electron. The potential energy $V$ is now† $-e^2/r$, so that equation
(69) becomes
$$\left\{\frac{d^2}{dr^2} - \frac{n(n+1)}{r^2} + \frac{2me^2}{\hbar^2}\frac{1}{r}\right\}\chi_0 = -\frac{2mH'}{\hbar^2}\chi_0. \tag{73}$$
A thorough investigation of this equation has been given by
Schrödinger.‡ We shall here obtain its eigenvalues $H'$ by an elementary
argument.

It is convenient to put
$$\chi_0 = f(r)e^{-r/a}, \tag{74}$$
introducing the new function $f(r)$, where $a$ is one or other of the
square roots
$$a = \pm\sqrt{-\hbar^2/2mH'}. \tag{75}$$
Equation (73) now becomes
$$\left\{\frac{d^2}{dr^2} - \frac{2}{a}\frac{d}{dr} - \frac{n(n+1)}{r^2} + \frac{2me^2}{\hbar^2}\frac{1}{r}\right\}f(r) = 0. \tag{76}$$
We look for a solution of this equation in the form of a power series
$$f(r) = \sum_s c_s r^s, \tag{77}$$
in which consecutive values for $s$ differ by unity although these
values themselves need not be integers. On substituting (77) in (76)
we obtain
$$\sum_s c_s\{s(s-1)r^{s-2} - (2s/a)r^{s-1} - n(n+1)r^{s-2} + (2me^2/\hbar^2)r^{s-1}\} = 0,$$
which gives, on equating to zero the coefficient of $r^{s-2}$, the following
relation between successive coefficients $c_s$,
$$c_s[s(s-1) - n(n+1)] = c_{s-1}[2(s-1)/a - 2me^2/\hbar^2]. \tag{78}$$


We saw in the preceding section that only those eigenfunctions $\chi$
are allowed that tend to zero with $r$ and hence, from (74), $f(r)$ must
tend to zero with $r$. The series (77) must therefore terminate on the
side of small $s$ and the minimum value of $s$ must be greater than zero.
Now the only possible minimum values of $s$ are those that make the
coefficient of $c_s$ in (78) vanish, i.e. $n+1$ and $-n$, and the second
of these is negative or zero. Thus the minimum value of $s$ must be
$n+1$. Since $n$ is always an integer, the values of $s$ will all be integers.

† The $e$ here, denoting minus the charge on an electron, is, of course, to be
distinguished from the $e$ denoting the base of exponentials.
‡ Schrödinger, Ann. d. Physik, 79 (1926), 361.

The series (77) will in general extend to infinity on the side of large $s$.
For large values of $s$ the ratio of successive terms is
$$\frac{c_s}{c_{s-1}}\,r = \frac{2r}{sa}$$
according to (78). Thus the series (77) will always converge, as the
ratios of the higher terms to one another are the same as for the
series
$$\sum_s \frac{1}{s!}\left(\frac{2r}{a}\right)^s, \tag{79}$$
which converges to $e^{2r/a}$.

We must now examine how our solution $\chi_0$ behaves for large
values of $r$. We must distinguish between the two cases of $H'$ positive
and $H'$ negative. For $H'$ negative, $a$ given by (75) will be real. Suppose
we take the positive value for $a$. Then as $r \to \infty$ the sum of the
series (77) will tend to infinity according to the same law as the sum
of the series (79), i.e. the law $e^{2r/a}$. Thus, from (74), $\chi_0$ will tend to
infinity according to the law $e^{r/a}$ and will not represent a physically
possible state. There is therefore in general no permissible solution
of (73) for negative values of $H'$. An exception arises, however, whenever
the series (77) terminates on the side of large $s$, in which case the
boundary conditions are all satisfied. The condition for this termination
of the series is that the coefficient of $c_{s-1}$ in (78) shall vanish for
some value of the suffix $s-1$ not less than its minimum value $n+1$,
which is the same as the condition that
$$\frac{s}{a} - \frac{me^2}{\hbar^2} = 0$$
for some integer $s$ not less than $n+1$. With the help of (75) this
condition becomes
$$H' = -\frac{me^4}{2s^2\hbar^2}, \tag{80}$$
and is thus a condition for the energy-level $H'$. Since $s$ may be any
positive integer, the formula (80) gives a discrete set of negative
energy-levels for the hydrogen atom. These are in agreement with
experiment. For each of them (except the lowest one $s = 1$) there
are several independent states, as there are various possible values
for $n$, namely any positive or zero integer less than $s$. This multiplicity
of states belonging to an energy-level is in addition to that
mentioned in the preceding section arising from the various possible
values for a component of angular momentum, which latter multiplicity
occurs with any central field of force. The $n$ multiplicity occurs
only with an inverse square law of force and even then is removed
when one takes relativistic mechanics into account, as will be found
in Chapter XI. The solution $\chi_0$ of (73) when $H'$ satisfies (80) tends to
zero exponentially as $r \to \infty$ and thus represents a closed state
(corresponding to an elliptic orbit in Bohr's theory).
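
The termination argument can be followed step by step with exact
arithmetic. The sketch below is an illustration under stated assumptions:
it uses units in which $m = e = \hbar = 1$, so that (78) reads
$c_s[s(s-1) - n(n+1)] = c_{s-1}[2(s-1)/a - 2]$ and (80) reads
$H' = -1/2s^2$; it normalizes the first coefficient $c_{n+1}$ to 1; and the
function names are hypothetical.

```python
from fractions import Fraction

def series_coefficients(n, a, max_terms=50):
    """Coefficients c_s of (77) from the recurrence (78), starting at s = n+1.

    Returns a dict {s: c_s}. Generation stops when a coefficient vanishes
    (the series terminates) or after max_terms further terms.
    """
    a = Fraction(a)
    s = n + 1
    coeffs = {s: Fraction(1)}
    for _ in range(max_terms):
        s += 1
        c = coeffs[s - 1] * (2 * (s - 1) / a - 2) / (s * (s - 1) - n * (n + 1))
        if c == 0:
            break
        coeffs[s] = c
    return coeffs

def energy_level(s):
    """Energy-level (80) in units with m = e = hbar = 1."""
    return -Fraction(1, 2 * s * s)

def number_of_states(s):
    """States in level s: n = 0, ..., s-1, each with 2n+1 values of m_z."""
    return sum(2 * n + 1 for n in range(s))
```

With $a$ equal to an integer $s \ge n+1$ the series stops at the power $r^s$;
for example `series_coefficients(0, 2)` gives $f(r) = r - \tfrac12 r^2$. With a
non-integral $a$ such as $\tfrac32$ no coefficient vanishes and the series
runs on. Spin is neglected here, as in the text, so `number_of_states(s)`
returns $s^2$.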

For any positive values of $H'$, $a$ given by (75) will be pure imaginary.
The series (77), which is like the series (79) for large $r$, will now have a
sum that remains finite as $r \to \infty$. Thus $\chi_0$ given by (74) will now remain
finite as $r \to \infty$ and will therefore be a permissible solution of (73),
giving a wave function $\psi$ that tends to zero according to the law $r^{-1}$ as
$r \to \infty$. Hence in addition to the discrete set of negative energy-levels
(80), all positive energy-levels are allowed. The states of positive
energy are not closed, since for them the integral to infinity $\int^\infty |\chi_0|^2\,dr$
does not converge. (These states correspond to the hyperbolic orbits
of Bohr's theory.)


40. Selection rules 

If a dynamical system is set up in a certain stationary state, it will 
remain in that stationary state so long as it is not acted upon by 
outside forces. Any atomic system in practice, however, frequently 
gets acted upon by external electromagnetic fields, under whose 
influence it is liable to cease to be in one stationary state and to make 
a transition to another. The theory of such transitions will be developed
in §§ 44 and 45. A result of this theory is that, to a high degree
of accuracy, transitions between two states cannot occur under the 
influence of electromagnetic radiation if, in a Heisenberg representa- 
tion with these two stationary states as two of the basic states, the 
matrix element, referring to these two states, of the representative 
of the total electric displacement D of the system vanishes. Now it 
happens for many atomic systems that the great majority of the 
matrix elements of D in a Heisenberg representation do vanish, and 
hence there are severe limitations on the possibilities for transitions. 
The rules that express these limitations are called selection rules. 

The idea of selection rules can be refined by a more detailed 
application of the theory of §§44 and 45, according to which 
the matrix elements of the different Cartesian components of the 
vector D are associated with different states of polarization of the 
electromagnetic radiation. The nature of this association is just what
one would get if one considered the matrix elements, or rather their
real parts, as the amplitudes of harmonic oscillators which interact
with the field of radiation according to classical electrodynamics.

There is a general method for obtaining all selection rules, as
follows. Let us call the constants of the motion which are diagonal in
the Heisenberg representation $\alpha$'s and let $D$ be one of the Cartesian
components of $\mathbf{D}$. We must obtain an algebraic equation connecting
$D$ and the $\alpha$'s which does not involve any dynamical variables other
than $D$ and the $\alpha$'s and which is linear in $D$. Such an equation will
be of the form

$$\sum_r f_r D g_r = 0, \tag{81}$$

where the $f_r$'s and $g_r$'s are functions of the $\alpha$'s only. If this equation
is expressed in terms of representatives, it gives us

$$\sum_r f_r(\alpha')\,\langle\alpha'|D|\alpha''\rangle\, g_r(\alpha'') = 0,$$

or

$$\langle\alpha'|D|\alpha''\rangle \sum_r f_r(\alpha')\, g_r(\alpha'') = 0,$$

which shows that $\langle\alpha'|D|\alpha''\rangle = 0$ unless

$$\sum_r f_r(\alpha')\, g_r(\alpha'') = 0. \tag{82}$$

This last equation, giving the connexion which must exist between
$\alpha'$ and $\alpha''$ in order that $\langle\alpha'|D|\alpha''\rangle$ may not vanish, constitutes the
selection rule, so far as the component $D$ of $\mathbf{D}$ is concerned.

Our work on the harmonic oscillator in § 34 provides an example
of a selection rule. Equation (8) is of the form (81) with $\bar\eta$ for $D$ and
$H$ playing the part of the $\alpha$'s, and it shows that the matrix elements
$\langle H'|\bar\eta|H''\rangle$ of $\bar\eta$ all vanish except those for which
$H''-H' = \hbar\omega$. The conjugate complex of this result is that the matrix
elements $\langle H'|\eta|H''\rangle$ of $\eta$ all vanish except those for which
$H''-H' = -\hbar\omega$. Since $q$ is a numerical multiple of $\eta-\bar\eta$, its
matrix elements $\langle H'|q|H''\rangle$ all vanish except those for which
$H''-H' = \pm\hbar\omega$. If the harmonic oscillator carries an electric
charge, its electric displacement $D$ will be proportional to $q$. The
selection rule is then that only those transitions can take place in
which the energy $H$ changes by a single quantum $\hbar\omega$.
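
The passage from (81) to (82) may be written out term by term for this example. The sketch below assumes that equation (8) of § 34 reads $\bar\eta H - H\bar\eta = \hbar\omega\,\bar\eta$, which is the form consistent with the result just quoted; the labelling of the $f_r$ and $g_r$ is one possible choice made here for illustration.

```latex
% Assumed form of equation (8): \bar\eta H - H\bar\eta - \hbar\omega\,\bar\eta = 0,
% matched against (81) with D = \bar\eta and the single \alpha equal to H.
\begin{align*}
  f_1 &= 1, & g_1 &= H; &
  f_2 &= -H, & g_2 &= 1; &
  f_3 &= -\hbar\omega, & g_3 &= 1, \\
  \sum_r f_r(H')\,g_r(H'') &= H'' - H' - \hbar\omega = 0 .
\end{align*}
```

Condition (82) thus reproduces $H''-H' = \hbar\omega$ as the only case in which $\langle H'|\bar\eta|H''\rangle$ can differ from zero.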

We shall now obtain the selection rules for $m_z$ and $k$ for an electron
moving in a central field of force. The components of electric
displacement are here proportional to the Cartesian coordinates $x$, $y$, $z$.

Taking first $m_z$, we have that $m_z$ commutes with $z$, or that

$$m_z z - z m_z = 0.$$

This is an equation of the required type (81), giving us the selection
rule

$$m_z' - m_z'' = 0$$

for the z-component of the displacement. Again, from equations
(23) we have

$$[m_z, [m_z, x]] = [m_z, y] = -x$$

or

$$m_z^2 x - 2m_z x m_z + x m_z^2 - \hbar^2 x = 0,$$

which is also of the type (81) and gives us the selection rule

$$m_z'^2 - 2m_z' m_z'' + m_z''^2 - \hbar^2 = 0$$

or

$$(m_z' - m_z'' - \hbar)(m_z' - m_z'' + \hbar) = 0$$

for the x-component of the displacement. The selection rule for the
y-component is the same. Thus our selection rules for $m_z$ are that
in transitions associated with radiation with a polarization corresponding
to an electric dipole in the z-direction, $m_z$ cannot change, while in
transitions associated with a polarization corresponding to an electric dipole
in the x-direction or y-direction, $m_z$ must change by $\pm\hbar$.

We can determine more accurately the state of polarization of the
radiation associated with a transition in which $m_z$ changes by $\pm\hbar$, by
considering the condition for the non-vanishing of matrix elements
of $x+iy$ and $x-iy$. We have

$$[m_z, x+iy] = y - ix = -i(x+iy)$$

or

$$m_z(x+iy) - (x+iy)(m_z+\hbar) = 0,$$

which is again of the type (81). It gives

$$m_z' - m_z'' - \hbar = 0$$

as the condition that $\langle m_z'|x+iy|m_z''\rangle$ shall not vanish. Similarly,

$$m_z' - m_z'' + \hbar = 0$$

is the condition that $\langle m_z'|x-iy|m_z''\rangle$ shall not vanish. Hence

$$\langle m_z'|x-iy|m_z'-\hbar\rangle = 0$$

or

$$\langle m_z'|x|m_z'-\hbar\rangle = i\langle m_z'|y|m_z'-\hbar\rangle = (a+ib)e^{i\omega t},$$

say, $a$, $b$, and $\omega$ being real. The conjugate complex of this is

$$\langle m_z'-\hbar|x|m_z'\rangle = -i\langle m_z'-\hbar|y|m_z'\rangle = (a-ib)e^{-i\omega t}.$$

Thus the vector $\tfrac12\{\langle m_z'|\mathbf{D}|m_z'-\hbar\rangle + \langle m_z'-\hbar|\mathbf{D}|m_z'\rangle\}$,
which determines the state of polarization of the radiation associated
with transitions for which $m_z'' = m_z'-\hbar$, has the following three
components

$$\begin{aligned}
\tfrac12\{\langle m_z'|x|m_z'-\hbar\rangle + \langle m_z'-\hbar|x|m_z'\rangle\}
  &= \tfrac12\{(a+ib)e^{i\omega t} + (a-ib)e^{-i\omega t}\} = a\cos\omega t - b\sin\omega t, \\
\tfrac12\{\langle m_z'|y|m_z'-\hbar\rangle + \langle m_z'-\hbar|y|m_z'\rangle\}
  &= \tfrac12 i\{-(a+ib)e^{i\omega t} + (a-ib)e^{-i\omega t}\} = a\sin\omega t + b\cos\omega t, \\
\tfrac12\{\langle m_z'|z|m_z'-\hbar\rangle + \langle m_z'-\hbar|z|m_z'\rangle\}
  &= 0.
\end{aligned} \tag{83}$$

From the form of these components we see that the associated radiation
moving in the z-direction will be circularly polarized, that
moving in any direction in the xy-plane will be linearly polarized in
this plane, and that moving in intermediate directions will be
elliptically polarized. The direction of circular polarization for
radiation moving in the z-direction will depend on whether $\omega$ is positive
or negative, and this will depend on which of the two states $m_z'$ or
$m_z'' = m_z'-\hbar$ has the greater energy.
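
The components (83) can be checked numerically. The following sketch builds them from the two matrix elements and confirms that the x- and y-components trace a circle of radius $\sqrt{a^2+b^2}$; the function name and the sample values of $a$, $b$, $\omega$ are assumptions made for illustration.

```python
import cmath
import math


def dipole_components(a: float, b: float, omega: float, t: float):
    """Return the three components of (83) at time t for real a, b, omega."""
    x_up = (a + 1j * b) * cmath.exp(1j * omega * t)  # <m_z'|x|m_z' - hbar>
    y_up = -1j * x_up                                # from <..|x|..> = i <..|y|..>
    x_dn = x_up.conjugate()                          # <m_z' - hbar|x|m_z'>
    y_dn = y_up.conjugate()                          # <m_z' - hbar|y|m_z'>
    d_x = (0.5 * (x_up + x_dn)).real
    d_y = (0.5 * (y_up + y_dn)).real
    return d_x, d_y, 0.0                             # the z-component vanishes


if __name__ == "__main__":
    for step in range(4):
        t = 0.5 * step
        d_x, d_y, _ = dipole_components(0.3, -0.4, 2.0, t)
        print(round(d_x, 4), round(d_y, 4), round(math.hypot(d_x, d_y), 4))
```

The constant modulus of the vector in the xy-plane is what gives circular polarization for radiation moving in the z-direction.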
We shall now determine the selection rule for $k$. We have

$$\begin{aligned}
[k(k+\hbar), z] &= [m_x^2, z] + [m_y^2, z] \\
  &= -y m_x - m_x y + x m_y + m_y x \\
  &= 2(m_y x - m_x y + i\hbar z) \\
  &= 2(m_y x - y m_x) = 2(x m_y - m_x y).
\end{aligned}$$

Similarly,

$$[k(k+\hbar), x] = 2(y m_z - m_y z)$$

and

$$[k(k+\hbar), y] = 2(m_x z - x m_z).$$

Hence

$$\begin{aligned}
[k(k+\hbar), [k(k+\hbar), z]]
  &= 2[k(k+\hbar), m_y x - m_x y + i\hbar z] \\
  &= 2m_y[k(k+\hbar), x] - 2m_x[k(k+\hbar), y] + 2i\hbar[k(k+\hbar), z] \\
  &= 4m_y(y m_z - m_y z) - 4m_x(m_x z - x m_z) + 2\{k(k+\hbar)z - zk(k+\hbar)\} \\
  &= 4(m_x x + m_y y + m_z z)m_z - 4(m_x^2 + m_y^2 + m_z^2)z
     + 2\{k(k+\hbar)z - zk(k+\hbar)\}.
\end{aligned}$$

From (22)

$$m_x x + m_y y + m_z z = 0 \tag{84}$$

and hence

$$[k(k+\hbar), [k(k+\hbar), z]] = -2\{k(k+\hbar)z + zk(k+\hbar)\},$$

which gives

$$k^2(k+\hbar)^2 z - 2k(k+\hbar)zk(k+\hbar) + zk^2(k+\hbar)^2
  - 2\hbar^2\{k(k+\hbar)z + zk(k+\hbar)\} = 0. \tag{85}$$

Similar equations hold for $x$ and $y$. These equations are of the
required type (81), and give us the selection rule

$$k'^2(k'+\hbar)^2 - 2k'(k'+\hbar)k''(k''+\hbar) + k''^2(k''+\hbar)^2
  - 2\hbar^2 k'(k'+\hbar) - 2\hbar^2 k''(k''+\hbar) = 0,$$

which reduces to

$$(k'+k''+2\hbar)(k'+k'')(k'-k''+\hbar)(k'-k''-\hbar) = 0.$$

A transition can take place between two states $k'$ and $k''$ only if one
of these four factors vanishes.
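
The reduction of the selection rule to four factors is a purely algebraic identity and can be verified directly. The sketch below works in units with $\hbar = 1$ and integer values of $k'$ and $k''$; the function names are assumptions introduced for illustration.

```python
def selection_polynomial(k1: int, k2: int, hbar: int = 1) -> int:
    """Left-hand side of the selection rule obtained from (85), with k' = k1, k'' = k2."""
    a1 = k1 * (k1 + hbar)
    a2 = k2 * (k2 + hbar)
    return a1**2 - 2 * a1 * a2 + a2**2 - 2 * hbar**2 * a1 - 2 * hbar**2 * a2


def factored_form(k1: int, k2: int, hbar: int = 1) -> int:
    """The same expression written as the product of four factors."""
    return (k1 + k2 + 2 * hbar) * (k1 + k2) * (k1 - k2 + hbar) * (k1 - k2 - hbar)


# Pairs (k', k'') in units of hbar for which the polynomial vanishes.
allowed = [(k1, k2) for k1 in range(5) for k2 in range(5)
           if selection_polynomial(k1, k2) == 0]

if __name__ == "__main__":
    print(allowed)
```

Besides the pairs differing by one unit, the list contains $(0, 0)$, which arises from the factor $(k'+k'')$ and has to be excluded by a separate argument.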

Now the first of the factors, $(k'+k''+2\hbar)$, can never vanish, since
the eigenvalues of $k$ are all positive or zero. The second, $(k'+k'')$, can
vanish only if $k' = 0$ and $k'' = 0$. But transitions between two states
with these values for $k$ cannot occur on account of other selection
rules, as may be seen from the following argument. If two states
(labelled respectively with a single prime and a double prime) are
such that $k' = 0$ and $k'' = 0$, then from (41) and the corresponding
results for $m_x$ and $m_y$, $m_x' = m_y' = m_z' = 0$ and $m_x'' = m_y'' = m_z'' = 0$.
The selection rule for $m_z$ now shows that the matrix elements of
$x$ and $y$ referring to the two states must vanish, as the value of $m_z$
does not change during the transition, and the similar selection rule
for $m_x$ or $m_y$ shows that the matrix element of $z$ also vanishes. Thus
transitions between the two states cannot occur. Our selection rule
for $k$ now reduces to

$$(k'-k''+\hbar)(k'-k''-\hbar) = 0,$$

showing that $k$ must change by $\pm\hbar$. This selection rule may be written

$$k'^2 - 2k'k'' + k''^2 - \hbar^2 = 0,$$

and since this is the condition that a matrix element $\langle k'|z|k''\rangle$ shall
not vanish, we get the equation

$$k^2 z - 2kzk + zk^2 - \hbar^2 z = 0$$

or

$$[k, [k, z]] = -z, \tag{86}$$

a result which could not easily be obtained in a more direct way.

As a final example we shall obtain the selection rule for the magnitude
$K$ of the total angular momentum $\mathbf{M}$ of a general atomic system.
Let $x$, $y$, $z$ be the coordinates of one of the electrons. We must obtain
the condition that the $(K', K'')$ matrix element of $x$, $y$, or $z$ shall not
vanish. This is evidently the same as the condition that the $(K', K'')$
matrix element of $\lambda_1$, $\lambda_2$, or $\lambda_3$ shall not vanish, where $\lambda_1$, $\lambda_2$, and $\lambda_3$
are any three independent linear functions of $x$, $y$, and $z$ with numerical
coefficients, or more generally with any coefficients that commute
with $K$ and are thus represented by matrices which are diagonal with
respect to $K$. Let

$$\begin{aligned}
\lambda_0 &= M_x x + M_y y + M_z z, \\
\lambda_x &= M_y z - M_z y - i\hbar x, \\
\lambda_y &= M_z x - M_x z - i\hbar y, \\
\lambda_z &= M_x y - M_y x - i\hbar z.
\end{aligned}$$

We have

$$\begin{aligned}
M_x\lambda_x + M_y\lambda_y + M_z\lambda_z
  &= \sum_{xyz} (M_x M_y z - M_x M_z y - i\hbar M_x x) \\
  &= \sum_{xyz} (M_x M_y - M_y M_x - i\hbar M_z) z = 0
\end{aligned} \tag{87}$$

from (29). Thus $\lambda_x$, $\lambda_y$, and $\lambda_z$ are not linearly independent functions
of $x$, $y$, and $z$. Any two of them, however, together with $\lambda_0$ are three
linearly independent functions of $x$, $y$, and $z$ and may be taken as the
above $\lambda_1$, $\lambda_2$, $\lambda_3$, since the coefficients $M_x$, $M_y$, $M_z$ all commute with $K$.
Our problem thus reduces to finding the condition that the $(K', K'')$
matrix elements of $\lambda_0$, $\lambda_x$, $\lambda_y$, and $\lambda_z$ shall not vanish. The physical
meanings of these $\lambda$'s are that $\lambda_0$ is proportional to the component of
the vector $(x, y, z)$ in the direction of the vector $\mathbf{M}$, and $\lambda_x$, $\lambda_y$, $\lambda_z$ are
proportional to the Cartesian components of the component of $(x, y, z)$
perpendicular to $\mathbf{M}$.

Since $\lambda_0$ is a scalar it must commute with $K$. It follows that only
the diagonal elements $\langle K'|\lambda_0|K'\rangle$ of $\lambda_0$ can differ from zero, so the
selection rule is that $K$ cannot change so far as $\lambda_0$ is concerned.
Applying (30) to the vector $\lambda_x$, $\lambda_y$, $\lambda_z$, we have

$$[M_z, \lambda_x] = \lambda_y, \qquad [M_z, \lambda_y] = -\lambda_x, \qquad [M_z, \lambda_z] = 0.$$

These relations between $M_z$ and $\lambda_x$, $\lambda_y$, $\lambda_z$ are of exactly the same form
as the relations (23), (24) between $m_z$ and $x$, $y$, $z$, and also (87) is of
the same form as (84). The dynamical variables $\lambda_x$, $\lambda_y$, $\lambda_z$ thus have the
same properties relative to the angular momentum $\mathbf{M}$ as $x$, $y$, $z$ have
relative to $\mathbf{m}$. The deduction of the selection rule for $k$ when the
electric displacement is proportional to $(x, y, z)$ can therefore be taken
over and applied to the selection rule for $K$ when the electric
displacement is proportional to $(\lambda_x, \lambda_y, \lambda_z)$. We find in this way that, so far as
$\lambda_x$, $\lambda_y$, $\lambda_z$ are concerned, the selection rule for $K$ is that it must change
by $\pm\hbar$.

Collecting results, we have as the selection rule for $K$ that it must
change by 0 or $\pm\hbar$. We have considered the electric displacement
produced by only one of the electrons, but the same selection rule
must hold for each electron and thus also for the total electric
displacement.


41. The Zeeman effect for the hydrogen atom 

We shall now consider the system of a hydrogen atom in a uniform
magnetic field. The Hamiltonian (57) with $V = -e^2/r$, which describes
the hydrogen atom in no external field, gets modified by the magnetic
field, the modification, according to classical mechanics, consisting
in the replacement of the components of momentum, $p_x$, $p_y$, $p_z$, by
$p_x + (e/c)A_x$, $p_y + (e/c)A_y$, $p_z + (e/c)A_z$, where $A_x$, $A_y$, $A_z$ are the
components of the vector potential describing the field. For a uniform
field of magnitude $\mathcal{H}$ in the direction of the z-axis we may take
$A_x = -\tfrac12\mathcal{H}y$, $A_y = \tfrac12\mathcal{H}x$, $A_z = 0$. The classical Hamiltonian will
then be

$$H = \frac{1}{2m}\left\{\left(p_x - \frac12\frac{e}{c}\mathcal{H}y\right)^2
  + \left(p_y + \frac12\frac{e}{c}\mathcal{H}x\right)^2 + p_z^2\right\} - \frac{e^2}{r}.$$

This classical Hamiltonian may be taken over into the quantum
theory if we add on to it a term giving the effect of the spin of the
electron. According to experimental evidence and according to the
theory of Chapter XI, the electron has a magnetic moment $-(e\hbar/2mc)\boldsymbol{\sigma}$,
where $\boldsymbol{\sigma}$ is the spin vector of § 37. The energy of this magnetic moment
in the magnetic field will be $(e\hbar\mathcal{H}/2mc)\sigma_z$. Thus the total quantum
Hamiltonian will be

$$H = \frac{1}{2m}\left\{\left(p_x - \frac12\frac{e}{c}\mathcal{H}y\right)^2
  + \left(p_y + \frac12\frac{e}{c}\mathcal{H}x\right)^2 + p_z^2\right\} - \frac{e^2}{r}
  + \frac{e\hbar\mathcal{H}}{2mc}\sigma_z. \tag{88}$$

There ought strictly to be other terms in this Hamiltonian giving the
interaction of the magnetic moment of the electron with the electric
field of the nucleus of the atom, but this effect is small, of the same
order of magnitude as the correction one gets by taking relativistic
mechanics into account, and will be neglected here. It will be taken
into account in the relativistic theory of the electron given in
Chapter XI.

If the magnetic field is not too large, we can neglect terms involving
$\mathcal{H}^2$, so that the Hamiltonian (88) reduces to

$$\begin{aligned}
H &= \frac{1}{2m}(p_x^2 + p_y^2 + p_z^2) - \frac{e^2}{r}
     + \frac{e\mathcal{H}}{2mc}(x p_y - y p_x) + \frac{e\hbar\mathcal{H}}{2mc}\sigma_z \\
  &= \frac{1}{2m}(p_x^2 + p_y^2 + p_z^2) - \frac{e^2}{r}
     + \frac{e\mathcal{H}}{2mc}(m_z + \hbar\sigma_z).
\end{aligned} \tag{89}$$

The extra terms due to the magnetic field are now $(e\mathcal{H}/2mc)(m_z+\hbar\sigma_z)$.
But these extra terms commute with the total Hamiltonian and are
thus constants of the motion. This makes the problem very easy.
The stationary states of the system, i.e. the eigenstates of the
Hamiltonian (89), will be those eigenstates of the Hamiltonian for no field
that are simultaneously eigenstates of the observables $m_z$ and $\sigma_z$, or
at least of the one observable $m_z+\hbar\sigma_z$, and the energy-levels of the
system will be those for the system with no field, given by (80) if
one considers only closed states, increased by an eigenvalue of
$(e\mathcal{H}/2mc)(m_z+\hbar\sigma_z)$. Thus stationary states of the system with no
field for which $m_z$ has the numerical value $m_z'$, an integral multiple
of $\hbar$, and for which also $\sigma_z$ has the numerical value $\sigma_z' = \pm1$, will still
be stationary states when the field is applied. Their energy will be
increased by an amount consisting of the sum of two parts, a part
$(e\mathcal{H}/2mc)m_z'$ arising from the orbital motion, which part may be
considered as due to an orbital magnetic moment $-em_z'/2mc$, and a part
$(e\mathcal{H}/2mc)\hbar\sigma_z'$ arising from the spin. The ratio of the orbital magnetic
moment to the orbital angular momentum $m_z'$ is $-e/2mc$, which is
half the ratio of the spin magnetic moment to the spin angular
momentum. This fact is sometimes referred to as the magnetic
anomaly of the spin.

Since the energy-levels now involve $m_z$, the selection rule for $m_z$
obtained in the preceding section becomes capable of direct
comparison with experiment. We take a Heisenberg representation in
which, among other constants of the motion, $m_z$ and $\sigma_z$ are diagonal.
The selection rule for $m_z$ now requires $m_z$ to change by $\hbar$, 0, or $-\hbar$,
while $\sigma_z$, since it commutes with the electric displacement, will not
change at all. Thus the energy difference between the two states
taking part in the transition process will differ by an amount
$e\hbar\mathcal{H}/2mc$, 0, or $-e\hbar\mathcal{H}/2mc$ from its value for no magnetic field.
Hence, from Bohr's frequency condition, the frequency of the
associated electromagnetic radiation will differ by $e\mathcal{H}/4\pi mc$, 0, or
$-e\mathcal{H}/4\pi mc$ from that for no magnetic field. This means that each
spectral line for no magnetic field gets split up by the field into three
components. If one considers radiation moving in the z-direction,
then from (83) the two outer components will be circularly polarized,
while the central undisplaced one will be of zero intensity. These
results are in agreement with experiment and also with the classical
theory of the Zeeman effect.
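
To give an idea of the magnitudes involved, the following sketch evaluates the frequency displacement $e\mathcal{H}/4\pi mc$ of the outer components for a given field. The Gaussian-unit constants, the function name, and the sample field of $10^4$ gauss are assumptions introduced for illustration.

```python
import math

# Assumed reference values in Gaussian units, not taken from the text.
E_CH = 4.8032047e-10   # electron charge, esu
M_E = 9.1093837e-28    # electron mass, g
C = 2.99792458e10      # velocity of light, cm/s


def zeeman_frequency_shift(field_gauss: float, delta_m: int) -> float:
    """Frequency shift in Hz when m_z changes by delta_m * hbar (delta_m = -1, 0, 1)."""
    if delta_m not in (-1, 0, 1):
        raise ValueError("the selection rule allows only delta_m = -1, 0, 1")
    return delta_m * E_CH * field_gauss / (4 * math.pi * M_E * C)


if __name__ == "__main__":
    for delta_m in (1, 0, -1):
        print(delta_m, zeeman_frequency_shift(1.0e4, delta_m))
```

For a field of $10^4$ gauss the outer components are displaced by about $1.4 \times 10^{10}$ cycles per second on either side of the undisplaced line.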


VII 
PERTURBATION THEORY 


42. General remarks 
In the preceding chapter exact treatments were given of some simple 
dynamical systems in the quantum theory. Most quantum problems, 
however, cannot be solved exactly with the present resources of 
mathematics, as they lead to equations whose solutions cannot be 
expressed in finite terms with the help of the ordinary functions of 
analysis. For such problems one can often use a perturbation method. 
This consists in splitting up the Hamiltonian into two parts, one of 
which must be simple and the other small. The first part may then 
be considered as the Hamiltonian of a simplified or unperturbed 
system, which can be dealt with exactly, and the addition of the 
second will then require small corrections, of the nature of a perturba- 
tion, in the solution for the unperturbed system. The requirement 
that the first part shall be simple requires in practice that it shall not 
involve the time explicitly. If the second part contains a small 
numerical factor $\epsilon$, we can obtain the solution of our equations for
the perturbed system in the form of a power series in $\epsilon$, which,
provided it converges, will give the answer to our problem with any
desired accuracy. Even when the series does not converge, the first 
approximation obtained by means of it is usually fairly accurate. 
There are two distinct methods in perturbation theory. In one of 
these the perturbation is considered as causing a modification of the 
states of motion of the unperturbed system. In the other we do not 
consider any modification to be made in the states of the unperturbed 
system, but we suppose that the perturbed system, instead of remain- 
ing permanently in one of these states, is continually changing from 
one to another, or making transitions, under the influence of the 
perturbation. Which method is to be used in any particular case 
depends on the nature of the problem to be solved. The first method 
is useful usually only when the perturbing energy (the correction in the 
Hamiltonian for the undisturbed system) does not involve the time 
explicitly, and is then applied to the stationary states. It can be used 
for calculating things that do not refer to any definite time, such as
the energy-levels of the stationary states of the perturbed system, or, 
in the case of collision problems, the probability of scattering through 
a given angle. The second method must, on the other hand, be used
for solving all problems involving a consideration of time, such as 
those about the transient phenomena that occur when the perturba- 
tion is suddenly applied, or more generally problems in which the 
perturbation varies with the time in any way (i.e. in which the per- 
turbing energy involves the time explicitly). Again, this second 
method must be used in collision problems, even though the per- 
turbing energy does not here involve the time explicitly, if one 
wishes to calculate absorption and emission probabilities, since these 
probabilities, unlike a scattering probability, cannot be defined with- 
out reference to a state of affairs that varies with the time. 
One can summarize the distinctive features of the two methods by
saying that, with the first method, one compares the stationary states
of the perturbed system with those of the unperturbed system; with
the second method one takes a stationary state of the unperturbed
system and sees how it varies with time under the influence of the
perturbation.


43. The change in the energy-levels caused by a perturbation 
The first of the above-mentioned methods will now be applied to 
the calculation of the changes in the energy-levels of a system caused 
by a perturbation. We assume the perturbing energy, like the Hamil- 
tonian for the unperturbed system, not to involve the time explicitly. 
Our problem has a meaning, of course, only provided the energy-levels 
of the unperturbed system are discrete and the differences between 
them are large compared with the changes in them caused by the 
perturbation. This circumstance results in the treatment of perturba- 
tion problems by the first method having some different features 
according to whether the energy-levels of the unperturbed system are 
discrete or continuous. 
Let the Hamiltonian of the perturbed system be

$$H = E + V, \tag{1}$$

$E$ being the Hamiltonian of the unperturbed system and $V$ the small
perturbing energy. By hypothesis each eigenvalue $H'$ of $H$ lies very
close to one and only one eigenvalue $E'$ of $E$. We shall use the same
number of primes to specify any eigenvalue of $H$ and the eigenvalue
of $E$ to which it lies very close. Thus we shall have $H''$ differing from
$E''$ by a small quantity of order $V$ and differing from $E'$ by a quantity
that is not small unless $E' = E''$. We must now take care always to
use different numbers of primes to specify eigenvalues of $H$ and $E$
which we do not want to lie very close together.

To obtain the eigenvalues of $H$, we have to solve the equation

$$H|H'\rangle = H'|H'\rangle$$

or

$$(H' - E)|H'\rangle = V|H'\rangle. \tag{2}$$

Let $|0\rangle$ be an eigenket of $E$ belonging to the eigenvalue $E'$ and
suppose the $|H'\rangle$ and $H'$ that satisfy (2) to differ from $|0\rangle$ and $E'$
only by small quantities and to be expressed as

$$\begin{aligned}
|H'\rangle &= |0\rangle + |1\rangle + |2\rangle + \dots, \\
H' &= E' + a_1 + a_2 + \dots,
\end{aligned} \tag{3}$$

where $|1\rangle$ and $a_1$ are of the first order of smallness (i.e. the same order
as $V$), $|2\rangle$ and $a_2$ are of the second order, and so on. Substituting
these expressions in (2), we obtain

$$\{E' - E + a_1 + a_2 + \dots\}\{|0\rangle + |1\rangle + |2\rangle + \dots\} = V\{|0\rangle + |1\rangle + \dots\}.$$

If we now separate the terms of zero order, of the first order, of the
second order, and so on, we get the following set of equations,

$$\begin{aligned}
(E' - E)|0\rangle &= 0, \\
(E' - E)|1\rangle + a_1|0\rangle &= V|0\rangle, \\
(E' - E)|2\rangle + a_1|1\rangle + a_2|0\rangle &= V|1\rangle, \\
&\;\;\vdots
\end{aligned} \tag{4}$$

The first of these equations tells us, what we have already assumed,
that $|0\rangle$ is an eigenket of $E$ belonging to the eigenvalue $E'$. The others
enable us to calculate the various corrections $|1\rangle$, $|2\rangle$, ..., $a_1$, $a_2$, ....

For the further discussion of these equations it is convenient to
introduce a representation in which $E$ is diagonal, i.e. a Heisenberg
representation for the unperturbed system, and to take $E$ itself as
one of the observables whose eigenvalues label the representatives.
Let the others, in the event of others being necessary, as is the case
when there is more than one eigenstate of $E$ belonging to any
eigenvalue, be called $\beta$'s. A basic bra is then $\langle E''\beta''|$. Since $|0\rangle$ is an
eigenket of $E$ belonging to the eigenvalue $E'$, we have

$$\langle E''\beta''|0\rangle = \delta_{E''E'}\, f(\beta''), \tag{5}$$

where $f(\beta'')$ is some function of the variables $\beta''$. With the help of this
result the second of equations (4), written in terms of representatives,
becomes

$$(E' - E'')\langle E''\beta''|1\rangle + a_1 \delta_{E''E'}\, f(\beta'')
  = \sum_{\beta'} \langle E''\beta''|V|E'\beta'\rangle f(\beta'). \tag{6}$$

Putting $E'' = E'$ here, we get

$$a_1 f(\beta'') = \sum_{\beta'} \langle E'\beta''|V|E'\beta'\rangle f(\beta'). \tag{7}$$

Equation (7) is of the form of the standard equation in the theory
of eigenvalues, so far as the variables $\beta'$ are concerned. It shows that
the various possible values for $a_1$ are the eigenvalues of the matrix
$\langle E'\beta''|V|E'\beta'\rangle$. This matrix is a part of the representative of the
perturbing energy in the Heisenberg representation for the
unperturbed system, namely, the part consisting of those elements that
refer to the same unperturbed energy-level $E'$ for their row and
column. Each of these values for $a_1$ gives, to the first order, an
energy-level of the perturbed system lying close to the energy-level $E'$ of the
unperturbed system.† There may thus be several energy-levels of the
perturbed system lying close to the one energy-level $E'$ of the
unperturbed system, their number being anything not exceeding the
number of independent states of the unperturbed system belonging
to the energy-level $E'$. In this way the perturbation may cause a
separation or partial separation of the energy-levels that coincide
at $E'$ for the unperturbed system.
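
A minimal sketch of equation (7) for the simplest degenerate case, a level with two independent states, is given below. It assumes the relevant block of $V$ to be a real symmetric two-by-two matrix, and the function name is introduced here for illustration only. The two values of $a_1$ are the eigenvalues of this block.

```python
import math


def first_order_shifts_2x2(v11: float, v12: float, v22: float):
    """Eigenvalues of the block [[v11, v12], [v12, v22]] of V within one degenerate level.

    These are the possible first-order corrections a_1 of equation (7).
    """
    mean = 0.5 * (v11 + v22)
    half_gap = math.hypot(0.5 * (v11 - v22), v12)
    return mean - half_gap, mean + half_gap


if __name__ == "__main__":
    # The off-diagonal element alone separates the level into two.
    print(first_order_shifts_2x2(0.0, 0.1, 0.0))
    # A block proportional to unity shifts the level without separating it.
    print(first_order_shifts_2x2(0.2, 0.0, 0.2))
```

The second call shows the case in which the degeneracy survives to the first order, the two values of $a_1$ being equal.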

Equation (7) also determines, to the zero order, the representatives
$\langle E''\beta''|0\rangle$ of the stationary states of the perturbed system belonging
to energy-levels lying close to $E'$, any solution $f(\beta')$ of (7) substituted
in (5) giving one such representative. Each of these stationary states
of the perturbed system approximates to one of the stationary states
of the unperturbed system, but the converse, that each stationary
state of the unperturbed system approximates to one of the stationary
states of the perturbed system, is not true, since the general
stationary state of the unperturbed system belonging to the
energy-level $E'$ is represented by the right-hand side of (5) with an arbitrary
function $f(\beta'')$. The problem of finding which stationary states of
the unperturbed system approximate to stationary states of the
perturbed system, i.e. the problem of finding the solutions $f(\beta')$ of
(7), corresponds to the problem of 'secular perturbations' in classical
mechanics. It should be noted that the above results are independent
of the values of all those matrix elements of the perturbing
energy which refer to two different energy-levels of the unperturbed
system.

† To distinguish these energy-levels one from another we should require some
more elaborate notation, since according to the present notation they must all be
specified by the same number of primes, namely by the number of primes specifying
the energy-level of the unperturbed system from which they arise. For our present
purposes, however, this more elaborate notation is not required.

Let us see what the above results become in the specially simple case
when there is only one stationary state of the unperturbed system
belonging to each energy-level.† In this case $E$ alone fixes the
representation, no $\beta$'s being required. The sum in (7) now reduces to a
single term and we get

$$a_1 = \langle E'|V|E'\rangle. \tag{8}$$

There is only one energy-level of the perturbed system lying close to
any energy-level of the unperturbed system and the change in energy
is equal, in the first order, to the corresponding diagonal element of the
perturbing energy in the Heisenberg representation for the unperturbed
system, or to the average value of the perturbing energy for the
corresponding unperturbed state. The latter formulation of the result is the same
as in classical mechanics when the unperturbed system is multiply
periodic.

† A system with only one stationary state belonging to each energy-level is often
called non-degenerate and one with two or more stationary states belonging to an
energy-level is called degenerate, although these words are not very appropriate from
the modern point of view.

We shall proceed to calculate the second-order correction $a_2$ in
the energy-level for the case when the unperturbed system is
non-degenerate. Equation (5) for this case reads

$$\langle E''|0\rangle = \delta_{E''E'},$$

with neglect of an unimportant numerical factor, and equation (6)
reads

$$(E' - E'')\langle E''|1\rangle + a_1 \delta_{E''E'} = \langle E''|V|E'\rangle.$$

This gives us the value of $\langle E''|1\rangle$ when $E'' \neq E'$, namely

$$\langle E''|1\rangle = \frac{\langle E''|V|E'\rangle}{E' - E''}. \tag{9}$$

The third of equations (4), written in terms of representatives,
becomes

$$(E' - E'')\langle E''|2\rangle + a_1\langle E''|1\rangle + a_2 \delta_{E''E'}
  = \sum_{E'''} \langle E''|V|E'''\rangle\langle E'''|1\rangle.$$

Putting $E'' = E'$ here, we get

$$a_1\langle E'|1\rangle + a_2 = \sum_{E''} \langle E'|V|E''\rangle\langle E''|1\rangle,$$

which reduces, with the help of (8), to

$$a_2 = \sum_{E'' \neq E'} \langle E'|V|E''\rangle\langle E''|1\rangle.$$

Substituting for $\langle E''|1\rangle$ from (9), we obtain finally

$$a_2 = \sum_{E'' \neq E'} \frac{\langle E'|V|E''\rangle\langle E''|V|E'\rangle}{E' - E''},$$

giving for the total energy change to the second order

$$a_1 + a_2 = \langle E'|V|E'\rangle
  + \sum_{E'' \neq E'} \frac{\langle E'|V|E''\rangle\langle E''|V|E'\rangle}{E' - E''}. \tag{10}$$

The method may be developed for the calculation of the higher
approximations if required. General recurrence formulas giving the
nth order corrections in terms of those of lower order have been
obtained by Born, Heisenberg, and Jordan.†
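
The accuracy of formula (10) may be judged from a small numerical example. The sketch below applies (10) to a non-degenerate system with a real symmetric perturbation and, for two levels, compares the result with the exact eigenvalues. The function names and the sample numbers are assumptions introduced for illustration.

```python
import math


def perturbed_levels_second_order(e_levels, v):
    """Energies E' + a_1 + a_2 from (8) and (10).

    e_levels: distinct unperturbed energies; v: real symmetric matrix of V
    in the representation in which E is diagonal.
    """
    result = []
    for i, e_i in enumerate(e_levels):
        a1 = v[i][i]
        a2 = sum(v[i][j] * v[j][i] / (e_i - e_j)
                 for j, e_j in enumerate(e_levels) if j != i)
        result.append(e_i + a1 + a2)
    return result


def exact_two_level(e1, e2, v):
    """Exact eigenvalues (lower, upper) of E + V for a two-level system."""
    h11, h22, h12 = e1 + v[0][0], e2 + v[1][1], v[0][1]
    mean = 0.5 * (h11 + h22)
    half_gap = math.hypot(0.5 * (h11 - h22), h12)
    return mean - half_gap, mean + half_gap


if __name__ == "__main__":
    levels = [0.0, 1.0]
    v = [[0.02, 0.05], [0.05, -0.01]]
    print(perturbed_levels_second_order(levels, v))
    print(exact_two_level(levels[0], levels[1], v))
```

With these numbers the two results differ by less than $10^{-4}$, an error of the third order in $V$.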


44. The perturbation considered as causing transitions

We shall now consider the second of the two perturbation methods
mentioned in § 42. We suppose again that we have an unperturbed
system governed by a Hamiltonian $E$ which does not involve the
time explicitly, and a perturbing energy $V$ which can now be an
arbitrary function of the time. The Hamiltonian for the perturbed
system is again $H = E+V$. For the present method it does not
make any essential difference whether the energy-levels of the
unperturbed system, i.e. the eigenvalues of $E$, form a discrete or
continuous set. We shall, however, take the discrete case, for
definiteness. We shall again work with a Heisenberg representation
for the unperturbed system, but as there will now be no advantage in
taking $E$ itself as one of the observables whose eigenvalues label the
representatives, we shall suppose we have a general set of $\alpha$'s to label
the representatives.

Let us suppose that at the initial time $t_0$ the system is in a state for
which the $\alpha$'s certainly have the values $\alpha'$. The ket corresponding to
this state is the basic ket $|\alpha'\rangle$. If there were no perturbation, i.e. if the
Hamiltonian were $E$, this state would be stationary. The perturbation
causes the state to change. At time $t$ the ket corresponding to the
state in Schrödinger's picture will be $T|\alpha'\rangle$, according to equation (1)
of § 27. The probability of the $\alpha$'s then having the values $\alpha''$ is

$$P(\alpha'\alpha'') = |\langle\alpha''|T|\alpha'\rangle|^2. \tag{11}$$

For $\alpha'' \neq \alpha'$, $P(\alpha'\alpha'')$ is the probability of a transition taking place
from state $\alpha'$ to state $\alpha''$ during the time interval $t_0 \to t$, while $P(\alpha'\alpha')$
is the probability of no transition taking place at all. The sum of
$P(\alpha'\alpha'')$ for all $\alpha''$ is, of course, unity.

† Z. f. Physik, 35 (1925), 565.

Let us now suppose that initially the system, instead of being
certainly in the state $\alpha'$, is in one or other of various states $\alpha'$ with
the probability $P_{\alpha'}$ for each. The Gibbs density corresponding to this
distribution is, according to (68) of § 33,

$$\rho = \sum_{\alpha'} |\alpha'\rangle P_{\alpha'} \langle\alpha'|. \tag{12}$$

At time $t$, each ket $|\alpha'\rangle$ will have changed to $T|\alpha'\rangle$ and each bra $\langle\alpha'|$
to $\langle\alpha'|\bar T$, so $\rho$ will have changed to

$$\rho_t = \sum_{\alpha'} T|\alpha'\rangle P_{\alpha'} \langle\alpha'|\bar T. \tag{13}$$

The probability of the $\alpha$'s then having the values $\alpha''$ will be, from
(73) of § 33,

$$\begin{aligned}
\langle\alpha''|\rho_t|\alpha''\rangle
  &= \sum_{\alpha'} \langle\alpha''|T|\alpha'\rangle P_{\alpha'} \langle\alpha'|\bar T|\alpha''\rangle \\
  &= \sum_{\alpha'} P_{\alpha'} P(\alpha'\alpha'')
\end{aligned} \tag{14}$$

with the help of (11). This result expresses that the probability of
the system being in the state $\alpha''$ at time $t$ is the sum of the probabilities
of the system being initially in any state $\alpha' \neq \alpha''$, and making a
transition from state $\alpha'$ to state $\alpha''$ and the probability of its being initially
in the state $\alpha''$ and making no transition. Thus the various transition
probabilities act independently of one another, according to the
ordinary laws of probability.
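
The content of (11), (13) and (14) can be illustrated with a system having only two basic states. In the sketch below the unitary matrix chosen for $T$, the initial probabilities, and the function names are all assumptions introduced for illustration. It checks that the diagonal elements of $\rho_t$ agree with the sum over $P_{\alpha'}P(\alpha'\alpha'')$.

```python
import math


def two_state_T(theta: float):
    """A hypothetical unitary T for two basic states; T[j][i] stands for <j|T|i>."""
    c, s = math.cos(theta), math.sin(theta)
    return [[complex(c, 0.0), complex(0.0, -s)],
            [complex(0.0, -s), complex(c, 0.0)]]


def transition_probabilities(T):
    """P[i][j] = |<j|T|i>|^2, the probability of passing from state i to state j, as in (11)."""
    n = len(T)
    return [[abs(T[j][i]) ** 2 for j in range(n)] for i in range(n)]


def final_occupations_via_density(T, p_init):
    """Diagonal elements <j|rho_t|j> of the density (13) for initial probabilities p_init."""
    n = len(T)
    return [sum((T[j][i] * p_init[i] * T[j][i].conjugate()).real for i in range(n))
            for j in range(n)]


if __name__ == "__main__":
    T = two_state_T(0.4)
    P = transition_probabilities(T)
    p_init = [0.7, 0.3]
    print(final_occupations_via_density(T, p_init))
    print([sum(p_init[i] * P[i][j] for i in range(2)) for j in range(2)])
```

Each row of `P` sums to unity, corresponding to the remark that the sum of $P(\alpha'\alpha'')$ over all $\alpha''$ is unity.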

The whole problem of calculating transitions thus reduces to the
determination of the probability amplitudes $\langle\alpha''|T|\alpha'\rangle$. These can be
worked out from the differential equation for $T$, equation (6) of § 27, or

$$i\hbar\, dT/dt = HT = (E+V)T. \tag{15}$$

The calculation can be simplified by working with

$$T^* = e^{iE(t-t_0)/\hbar}\, T. \tag{16}$$

We have

$$\begin{aligned}
i\hbar\, dT^*/dt &= e^{iE(t-t_0)/\hbar}(-ET + i\hbar\, dT/dt) \\
  &= e^{iE(t-t_0)/\hbar}\, VT = V^* T^*,
\end{aligned} \tag{17}$$

where

$$V^* = e^{iE(t-t_0)/\hbar}\, V e^{-iE(t-t_0)/\hbar}, \tag{18}$$

i.e. $V^*$ is the result of applying a certain unitary transformation to $V$.
Equation (17) is of a more convenient form than (15), because (17)
makes the change in $T^*$ depend entirely on the perturbation $V$, and
for $V = 0$ it would make $T^*$ equal its initial value, namely unity.
We have from (16)

$$\langle\alpha''|T^*|\alpha'\rangle = e^{iE''(t-t_0)/\hbar} \langle\alpha''|T|\alpha'\rangle,$$

so that

$$P(\alpha'\alpha'') = |\langle\alpha''|T^*|\alpha'\rangle|^2, \tag{19}$$

showing that $T^*$ and $T$ are equally good for determining transition
probabilities.


Our work up to the present has been exact. We now assume V is 
a small quantity of the first order and express 7'* in the form 


T* — 14784 7TF+..., (20) 
where 7* is of the first order, 7'¥ is of the second, and so on. Substi- 
tuting (20) into (17) and equating terms of equal order, we get 

marTar—V*, 
aT Fidu V*TT, (21) 


From the first of these equations we obtain 


t 
T* = —if-} | V*(t') dt’, (22) 
to 


from the second we obtain 
t 


rie 

T* — —j-2 i] V*(t') dt’ f V*(t’) dt”, (23) 
to to 

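and so on. For many practical problems it is sufficiently accurate to
retain only the term $T_1^*$, which gives for the transition probability
$P(\alpha'\alpha'')$ with $\alpha'' \neq \alpha'$

$$P(\alpha'\alpha'') = \hbar^{-2} \left| \langle\alpha''| \int_{t_0}^{t} V^*(t')\,dt' \,|\alpha'\rangle \right|^2 = \hbar^{-2} \left| \int_{t_0}^{t} \langle\alpha''|V^*(t')|\alpha'\rangle\,dt' \right|^2. \tag{24}$$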


We obtain in this way the transition probability to the second order
of accuracy. The result depends only on the matrix element
$\langle\alpha''|V^*(t')|\alpha'\rangle$ of $V^*(t')$ referring to the two
states concerned, with $t'$ going from $t_0$ to $t$. Since $V^*$ is real,
like $V$,

$$\langle\alpha''|V^*(t')|\alpha'\rangle = \overline{\langle\alpha'|V^*(t')|\alpha''\rangle}$$

and hence

$$P(\alpha'\alpha'') = P(\alpha''\alpha') \tag{25}$$

to the second order of accuracy.
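As a rough numerical illustration of (24) and (25), the following sketch evaluates the first-order probability by the trapezoidal rule. It rests on assumptions that are not in the text: units with $\hbar = 1$ by default, a hypothetical function name `first_order_probability`, and an arbitrary Gaussian-shaped matrix element chosen only for the example.

```python
import numpy as np

def first_order_probability(v_elem, e_final, e_initial, times, hbar=1.0):
    """Approximate (24): P = hbar^-2 |integral of <a''|V*(t')|a'> dt'|^2.

    v_elem : samples of the matrix element <a''|V(t')|a'> on the grid `times`
    times  : increasing time grid; times[0] plays the role of t0
    The phase factor coming from (18) turns V into V*.
    """
    times = np.asarray(times, dtype=float)
    v_elem = np.asarray(v_elem, dtype=complex)
    phase = np.exp(1j * (e_final - e_initial) * (times - times[0]) / hbar)
    integrand = phase * v_elem
    # Trapezoidal rule written out by hand.
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(times))
    return abs(integral) ** 2 / hbar ** 2

# Illustrative values only: a weak perturbation switched on and off smoothly.
times = np.linspace(0.0, 3.0, 2001)
v = 0.01 * np.exp(-((times - 1.5) ** 2))
p_up = first_order_probability(v, 1.0, 0.0, times)             # a' -> a''
p_down = first_order_probability(np.conj(v), 0.0, 1.0, times)  # a'' -> a'
print(p_up, p_down)
```

The two printed numbers agree, which is the content of (25): exchanging the two states replaces the integrand by its conjugate complex and leaves the modulus unchanged.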




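Sometimes one is interested in a transition $\alpha' \to \alpha''$ such
that the matrix element $\langle\alpha''|V^*|\alpha'\rangle$ vanishes, or is
small compared with other matrix elements of $V^*$. It is then necessary
to work to a higher accuracy. If we retain only the terms $T_1^*$ and
$T_2^*$, we get, for $\alpha'' \neq \alpha'$,

$$P(\alpha'\alpha'') = \hbar^{-2} \left| \int_{t_0}^{t} \langle\alpha''|V^*(t')|\alpha'\rangle\,dt' - i\hbar^{-1} \sum_{\alpha''' \neq \alpha',\alpha''} \int_{t_0}^{t} \langle\alpha''|V^*(t')|\alpha'''\rangle\,dt' \int_{t_0}^{t'} \langle\alpha'''|V^*(t'')|\alpha'\rangle\,dt'' \right|^2. \tag{26}$$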

The terms $\alpha''' = \alpha'$ and $\alpha''' = \alpha''$ are omitted from
the sum since they are small compared with other terms of the sum, on
account of the smallness of $\langle\alpha''|V^*|\alpha'\rangle$. To
interpret the result (26), we may suppose that the term

$$\int_{t_0}^{t} \langle\alpha''|V^*(t')|\alpha'\rangle\,dt' \tag{27}$$

gives rise to a transition directly from state $\alpha'$ to state
$\alpha''$, while the term

$$-i\hbar^{-1} \int_{t_0}^{t} \langle\alpha''|V^*(t')|\alpha'''\rangle\,dt' \int_{t_0}^{t'} \langle\alpha'''|V^*(t'')|\alpha'\rangle\,dt'' \tag{28}$$

gives rise to a transition from state $\alpha'$ to state $\alpha'''$,
followed by a transition from state $\alpha'''$ to state $\alpha''$. The
state $\alpha'''$ is called an intermediate state in this
interpretation. We must add the term (27) to the
various terms (28) corresponding to different intermediate states 
and then take the square of the modulus of the sum, which means 
that there is interference between the different transition processes— 
the direct one and those involving intermediate states—and one can- 
not give a meaning to the probability for one of these processes by 
itself. For each of these processes, however, there is a probability 
amplitude. If one carries out the perturbation method to a higher 
degree of accuracy, one obtains a result which can be interpreted 
similarly, with the help of more complicated transition processes 
involving a succession of intermediate states. 
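The role of an intermediate state can be checked numerically in a small model. The sketch below is an illustration under assumptions not made in the text: a hypothetical three-level system with a constant perturbation, no direct coupling between the initial and final states, and units with $\hbar = 1$. The closed form used for the second-order amplitude is what the double integral in (23) gives when $V$ is independent of the time and $t_0 = 0$; all function names are invented for the example.

```python
import numpy as np

def second_order_amplitude(v_fm, v_mi, e_i, e_m, e_f, t, hbar=1.0):
    """<f|T2*|i> from (23) for a constant V and a single intermediate state m."""
    w_mi = (e_m - e_i) / hbar
    w_fi = (e_f - e_i) / hbar
    w_fm = (e_f - e_m) / hbar
    # Double time integral of exp(i w_fm t') exp(i w_mi t'') done analytically.
    outer = ((np.exp(1j * w_fi * t) - 1) / (1j * w_fi)
             - (np.exp(1j * w_fm * t) - 1) / (1j * w_fm)) / (1j * w_mi)
    return -(v_fm * v_mi / hbar ** 2) * outer

def exact_probability(h, i, f, t, hbar=1.0):
    """|<f|exp(-iHt/hbar)|i>|^2 by diagonalizing the Hermitian matrix h."""
    w, u = np.linalg.eigh(h)
    evolution = (u * np.exp(-1j * w * t / hbar)) @ u.conj().T
    return abs(evolution[f, i]) ** 2

# Hypothetical model: states 0 and 2 are coupled only through state 1.
e = np.array([0.0, 1.0, 2.5])
v = 0.02
h = np.diag(e) + v * np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
t = 5.0
p_second = abs(second_order_amplitude(v, v, e[0], e[1], e[2], t)) ** 2
p_exact = exact_probability(h, 0, 2, t)
print(p_second, p_exact)
```

Since the direct term (27) vanishes in this model, the whole transition probability comes from the process passing through the intermediate state, and the two numbers agree closely for small `v`.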


45. Application to radiation 

In the preceding section a general theory of the perturbation of an
atomic system was developed, in which the perturbing energy could
vary with the time in an arbitrary way. A perturbation of this
kind can be realized in practice by allowing incident electromagnetic
radiation to fall on the system. Let us see what our result (24) reduces
to in this case.

If we neglect the effects of the magnetic field of the incident
radiation, and if we further assume that the wave-lengths of the harmonic
components of this radiation are all large compared with the
dimensions of the atomic system, then the perturbing energy is simply the
scalar product

$$V = (\mathbf{D}, \boldsymbol{\mathcal{E}}), \tag{29}$$

where $\mathbf{D}$ is the total electric displacement of the system and
$\boldsymbol{\mathcal{E}}$ is the electric force of the incident radiation.
We suppose $\boldsymbol{\mathcal{E}}$ to be a given function of the time.
If we take for simplicity the case when the incident radiation is plane
polarized with its electric vector in a certain direction and let $D$
denote the Cartesian component of $\mathbf{D}$ in this direction, the
expression (29) for $V$ reduces to the ordinary product

$$V = D\mathcal{E},$$

where $\mathcal{E}$ is the magnitude of the vector
$\boldsymbol{\mathcal{E}}$. The matrix elements of $V$ are

$$\langle\alpha''|V|\alpha'\rangle = \langle\alpha''|D|\alpha'\rangle\,\mathcal{E},$$

since $\mathcal{E}$ is a number. The matrix element
$\langle\alpha''|D|\alpha'\rangle$ is independent of $t$. From (18)

$$\langle\alpha''|V^*(t)|\alpha'\rangle = \langle\alpha''|D|\alpha'\rangle\, e^{i(E''-E')(t-t_0)/\hbar}\,\mathcal{E}(t),$$


and hence the expression (24) for the transition probability becomes

$$P(\alpha'\alpha'') = \hbar^{-2}\, |\langle\alpha''|D|\alpha'\rangle|^2 \left| \int_{t_0}^{t} e^{i(E''-E')(t'-t_0)/\hbar}\,\mathcal{E}(t')\,dt' \right|^2. \tag{30}$$

If the incident radiation during the time interval $t_0$ to $t$ is resolved
into its Fourier components, the energy crossing unit area per unit
frequency range about the frequency $\nu$ will be, according to classical
electrodynamics,

$$E_\nu = \frac{c}{2\pi} \left| \int_{t_0}^{t} e^{2\pi i\nu(t'-t_0)}\,\mathcal{E}(t')\,dt' \right|^2. \tag{31}$$

Comparing this with (30), we obtain

$$P(\alpha'\alpha'') = 2\pi c^{-1}\hbar^{-2}\, |\langle\alpha''|D|\alpha'\rangle|^2\, E_\nu, \tag{32}$$

where

$$\nu = |E''-E'|/h. \tag{33}$$
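The comparison leading to (32) may be spelled out in one line. This is only a restatement of the step just made, taking (31) in the form $E_\nu = (c/2\pi)\,|\ldots|^2$ and using the fact that $\mathcal{E}(t')$ is real, so that reversing the sign of the exponent replaces the integral by its conjugate complex and does not alter its modulus:

$$2\pi\nu = \frac{|E''-E'|}{\hbar} \quad\Longrightarrow\quad \left| \int_{t_0}^{t} e^{i(E''-E')(t'-t_0)/\hbar}\,\mathcal{E}(t')\,dt' \right|^2 = \left| \int_{t_0}^{t} e^{2\pi i\nu(t'-t_0)}\,\mathcal{E}(t')\,dt' \right|^2 = \frac{2\pi}{c}\,E_\nu ,$$

and multiplying by $\hbar^{-2}\,|\langle\alpha''|D|\alpha'\rangle|^2$ gives (32).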


From this result we see in the first place that the transition
probability depends only on that Fourier component of the incident
radiation whose frequency $\nu$ is connected with the change of energy
by (33). This gives us Bohr's Frequency Condition and shows how the ideas
of Bohr's atomic theory, which was the forerunner of quantum
mechanics, can be fitted in with quantum mechanics.

The present elementary theory does not tell us anything about the 
energy of the field of radiation. It would be reasonable to assume, 
though, that the energy absorbed or liberated by the atomic system 
in the transition process comes from or goes into the component of 
the radiation with frequency $\nu$ given by (33). This assumption will
be justified by the more complete theory of radiation given in 
Chapter X. The result (32) is then to be interpreted as the proba- 
bility of the system, if initially in the state of lower energy, absorb- 
ing radiation and being carried to the upper state, and if initially in 
the upper state, being stimulated by the incident radiation to emit 
and fall to the lower state. The present theory does not account for 
the experimental fact that the system, if in the upper state with no 
incident radiation, can emit spontaneously and fall to the lower state, 
but this also will be accounted for by the more complete theory of 
Chapter X. 

The existence of the phenomenon of stimulated emission was
inferred by Einstein,† long before the discovery of quantum mechanics,
from a consideration of statistical equilibrium between atoms and a
field of black-body radiation satisfying Planck's law. Einstein showed
that the transition probability for stimulated emission must equal
that for absorption between the same pair of states, in agreement
with the present quantum theory, and deduced also a relation
connecting this transition probability with that for spontaneous emission,
which relation is in agreement with the theory of Chapter X.

The matrix element $\langle\alpha''|D|\alpha'\rangle$ in (32) plays the part
of the amplitude of one of the Fourier components of $D$ in the classical
theory of a multiply-periodic system interacting with radiation. In fact
it was the idea of replacing classical Fourier components by matrix
elements which led Heisenberg to the discovery of quantum mechanics in
1925. Heisenberg assumed that the formulas describing the interaction
with radiation of a system in the quantum theory can be obtained from
the classical formulas by substituting for the Fourier components of
the total electric displacement of the system the corresponding matrix
elements. According to this assumption applied to spontaneous
emission, a system having an electric moment $D$ will, when in the state
$\alpha'$, spontaneously emit radiation of frequency $\nu = (E'-E'')/h$,
where $E''$ is an energy-level, less than $E'$, of some state $\alpha''$,
at the rate

$$\frac{4}{3}\,\frac{(2\pi\nu)^4}{c^3}\, |\langle\alpha''|D|\alpha'\rangle|^2. \tag{34}$$

The distribution of this radiation over the different directions of
emission and its state of polarization for each direction will be the
same as that for a classical electric dipole of moment equal to the
real part of $\langle\alpha''|D|\alpha'\rangle$. To interpret this rate of
emission of radiant energy as a transition probability, we must divide
it by the quantum of energy of this frequency, namely $h\nu$, and call it
the probability per unit time of this quantum being spontaneously
emitted, with the atomic system simultaneously dropping to the state
$\alpha''$ of lower energy. These assumptions of Heisenberg are justified
by the present radiation theory, supplemented by the spontaneous
transition theory of Chapter X.

† Einstein, Phys. Zeits. 18 (1917), 121.


46. Transitions caused by a perturbation independent of the 
time 

The perturbation method of § 44 is still valid when the perturbing
energy $V$ does not involve the time $t$ explicitly. Since the total
Hamiltonian $H$ in this case does not involve $t$ explicitly, we could
now, if desired, deal with the system by the perturbation method of
§ 43 and find its stationary states. Whether this method would be
convenient or not would depend on what we want to find out about
the system. If what we have to calculate makes an explicit reference
to the time, e.g. if we have to calculate the probability of the system
being in a certain state at one time when we are given that it is in a
certain state at another time, the method of § 44 would be the more
convenient one.

Let us see what the result (24) for the transition probability becomes
when $V$ does not involve $t$ explicitly and let us take $t_0 = 0$ to
simplify the writing. The matrix element
$\langle\alpha''|V|\alpha'\rangle$ is now independent of $t$, and from (18)

$$\langle\alpha''|V^*(t')|\alpha'\rangle = \langle\alpha''|V|\alpha'\rangle\, e^{i(E''-E')t'/\hbar}, \tag{35}$$

so

$$\int_0^t \langle\alpha''|V^*(t')|\alpha'\rangle\,dt' = \langle\alpha''|V|\alpha'\rangle\, \frac{e^{i(E''-E')t/\hbar} - 1}{i(E''-E')/\hbar},$$

provided $E'' \neq E'$. Thus the transition probability (24) becomes

$$\begin{aligned}
P(\alpha'\alpha'') &= |\langle\alpha''|V|\alpha'\rangle|^2\, [e^{i(E''-E')t/\hbar} - 1][e^{-i(E''-E')t/\hbar} - 1]/(E''-E')^2 \\
&= 2\,|\langle\alpha''|V|\alpha'\rangle|^2\, [1 - \cos\{(E''-E')t/\hbar\}]/(E''-E')^2.
\end{aligned} \tag{36}$$

If $E''$ differs appreciably from $E'$ this transition probability is small
and remains so for all values of $t$. This result is required by the law
of the conservation of energy. The total energy $H$ is constant and
hence the proper-energy $E$ (i.e. the energy with neglect of the part
$V$ due to the perturbation), being approximately equal to $H$, must
be approximately constant. This means that if $E$ initially has the
numerical value $E'$, at any later time there must be only a small
probability of its having a numerical value differing considerably
from $E'$.
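The smallness of (36) when $E''$ differs appreciably from $E'$ can be seen in a quick numerical comparison with an exactly soluble case. The sketch below assumes a system possessing only the two states concerned, units with $\hbar = 1$, and illustrative numbers; the function names are hypothetical and the exact result is obtained by diagonalizing the $2 \times 2$ Hamiltonian.

```python
import numpy as np

def p_first_order(v, e_initial, e_final, t, hbar=1.0):
    """Formula (36): 2|v|^2 [1 - cos((E''-E')t/hbar)] / (E''-E')^2."""
    d = e_final - e_initial
    return 2 * abs(v) ** 2 * (1 - np.cos(d * t / hbar)) / d ** 2

def p_exact_two_level(v, e_initial, e_final, t, hbar=1.0):
    """Exact |<a''|T|a'>|^2 for a system with only these two states."""
    h = np.array([[e_initial, np.conj(v)], [v, e_final]], dtype=complex)
    w, u = np.linalg.eigh(h)
    evolution = (u * np.exp(-1j * w * t / hbar)) @ u.conj().T
    return abs(evolution[1, 0]) ** 2

v, e1, e2 = 0.01, 0.0, 1.0  # illustrative numbers only
for t in (0.5, 2.0, 10.0):
    print(t, p_first_order(v, e1, e2, t), p_exact_two_level(v, e1, e2, t))
```

Formula (36) can never exceed $4|\langle\alpha''|V|\alpha'\rangle|^2/(E''-E')^2$, so with these numbers the probability stays below about $4 \times 10^{-4}$ at all times.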

On the other hand, when the initial state $\alpha'$ is such that there
exists another state $\alpha''$ having the same or very nearly the same
proper-energy $E$, the probability of a transition to the final state
$\alpha''$ may be quite large. The case of physical interest now is that
in which there is a continuous range of final states $\alpha''$ having a
continuous range of proper-energy levels $E''$ passing through the value
$E'$ of the proper-energy of the initial state. The initial state must
not be one of the continuous range of final states, but may be either a
separate discrete state or one of another continuous range of states.
We shall now have, remembering the rules of § 18 for the interpretation
of probability amplitudes with continuous ranges of states, that, with
$P(\alpha'\alpha'')$ having the value (36), the probability of a
transition to a final state within the small range $\alpha''$ to
$\alpha''+d\alpha''$ will be $P(\alpha'\alpha'')\,d\alpha''$ if the initial
state $\alpha'$ is discrete and will be proportional to this quantity if
$\alpha'$ is one of a continuous range.

We may suppose that the $\alpha$'s describing the final state consist of
$E$ together with a number of other dynamical variables $\beta$, so that
we have a representation like that of § 43 for the degenerate case. (The
$\beta$'s, however, need have no meaning for the initial state $\alpha'$.)
We shall suppose for definiteness that the $\beta$'s have only discrete
eigenvalues. The total probability of a transition to a final state
$\alpha''$ for which the $\beta$'s have the values $\beta''$ and $E$ has any
value (there will be a strong probability of its having a value near the
initial value $E'$) will now be (or be proportional to)

$$\begin{aligned}
\int P(\alpha'\alpha'')\,dE'' &= 2\int |\langle E''\beta''|V|\alpha'\rangle|^2\, [1-\cos\{(E''-E')t/\hbar\}]/(E''-E')^2\, dE'' \\
&= 2t\hbar^{-1} \int_{-\infty}^{\infty} |\langle E'+\hbar x/t,\ \beta''|V|\alpha'\rangle|^2\, [1-\cos x]/x^2\, dx
\end{aligned} \tag{37}$$

if one makes the substitution $(E''-E')t/\hbar = x$. For large values of
$t$ this reduces to

$$2t\hbar^{-1}\, |\langle E'\beta''|V|\alpha'\rangle|^2 \int_{-\infty}^{\infty} [1-\cos x]/x^2\, dx = 2\pi t\hbar^{-1}\, |\langle E'\beta''|V|\alpha'\rangle|^2. \tag{38}$$

Thus the total probability up to time $t$ of a transition to a final state
for which the $\beta$'s have the values $\beta''$ is proportional to $t$.
There is therefore a definite probability coefficient, or probability per
unit time, for the transition process under consideration, having the
value

$$2\pi\hbar^{-1}\, |\langle E'\beta''|V|\alpha'\rangle|^2. \tag{39}$$

It is proportional to the square of the modulus of the matrix element,
associated with this transition, of the perturbing energy.
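The passage from (37) to (38) and (39) can be checked numerically. The sketch below makes simplifying assumptions that are not in the text: the squared matrix element is taken to be constant (`v_sq`) over a band of final energies of half-width `half_width` about $E'$, units are such that $\hbar = 1$ by default, and the function names are hypothetical. The integrand of (37) is rewritten with `np.sinc` so that the point $E'' = E'$ causes no division by zero.

```python
import numpy as np

def total_transition_probability(t, v_sq, half_width, n=200_001, hbar=1.0):
    """Integrate (36) over E'' in [E' - half_width, E' + half_width]."""
    delta = np.linspace(-half_width, half_width, n)  # delta = E'' - E'
    # 2[1 - cos(delta t/hbar)]/delta^2 = (t/hbar)^2 sinc^2(delta t/(2 pi hbar)),
    # with numpy's convention sinc(y) = sin(pi y)/(pi y).
    p = v_sq * (t / hbar) ** 2 * np.sinc(delta * t / (2 * np.pi * hbar)) ** 2
    step = delta[1] - delta[0]
    return step * (p.sum() - 0.5 * (p[0] + p[-1]))  # trapezoidal rule

def probability_coefficient(v_sq, hbar=1.0):
    """Formula (39): probability per unit time, 2 pi |<E' b''|V|a'>|^2 / hbar."""
    return 2 * np.pi * v_sq / hbar

v_sq = 1e-6  # illustrative value of the squared matrix element
for t in (20.0, 40.0):
    print(t, total_transition_probability(t, v_sq, 50.0),
          probability_coefficient(v_sq) * t)
```

For times large compared with $\hbar$ divided by the width of the band, the integrated probability approaches $t$ times the coefficient (39), and doubling $t$ doubles it.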
If the matrix element $\langle E'\beta''|V|\alpha'\rangle$ is small compared
with other matrix elements of $V$, we must work with the more accurate
formula (26). We have from (35)

$$\begin{aligned}
&\int_0^t \langle\alpha''|V^*(t')|\alpha'''\rangle\,dt' \int_0^{t'} \langle\alpha'''|V^*(t'')|\alpha'\rangle\,dt'' \\
&\quad= \langle\alpha''|V|\alpha'''\rangle\langle\alpha'''|V|\alpha'\rangle \int_0^t e^{i(E''-E''')t'/\hbar}\,dt' \int_0^{t'} e^{i(E'''-E')t''/\hbar}\,dt'' \\
&\quad= \frac{\langle\alpha''|V|\alpha'''\rangle\langle\alpha'''|V|\alpha'\rangle}{i(E'''-E')/\hbar} \int_0^t \{e^{i(E''-E')t'/\hbar} - e^{i(E''-E''')t'/\hbar}\}\,dt'.
\end{aligned}$$

For $E''$ close to $E'$, only the first term in the integrand here gives
rise to a transition probability of physical importance and the second
term may be discarded. Using this result in (26) we get

$$P(\alpha'\alpha'') = 2\left| \langle\alpha''|V|\alpha'\rangle - \sum_{\alpha'''\neq\alpha',\alpha''} \frac{\langle\alpha''|V|\alpha'''\rangle\langle\alpha'''|V|\alpha'\rangle}{E'''-E'} \right|^2 \frac{1-\cos\{(E''-E')t/\hbar\}}{(E''-E')^2},$$

which replaces (36). Proceeding as before, we obtain for the transition
probability per unit time to a final state for which the $\beta$'s have
the values $\beta''$ and $E$ has a value close to its initial value $E'$

$$\frac{2\pi}{\hbar} \left| \langle E'\beta''|V|\alpha'\rangle - \sum_{\alpha'''\neq\alpha',\alpha''} \frac{\langle E'\beta''|V|\alpha'''\rangle\langle\alpha'''|V|\alpha'\rangle}{E'''-E'} \right|^2. \tag{40}$$

This formula shows how intermediate states, differing from the initial
state and final state, play a role in the determination of a probability
coefficient.

In order that the approximations used in deriving (39) and (40) may
be valid, the time $t$ must be not too small and not too large. It must
be large compared with the periods of the atomic system in order that
the approximate evaluation of the integral (37) leading to the result
(38) may be valid, while it must not be excessively large or else the
general formula (24) or (26) will break down. In fact one could make
the probability (38) greater than unity by taking $t$ large enough. The
upper limit to $t$ is fixed by the condition that the probability (24) or
(26), or $t$ times (39) or (40), must be small compared with unity. There
is no difficulty in $t$ satisfying both these conditions simultaneously
provided the perturbing energy $V$ is sufficiently small.


47. The anomalous Zeeman effect 


One of the simplest examples of the perturbation method of § 43 
is the calculation of the first-order change in the energy-levels of an 
atom caused by a uniform magnetic field. The problem of a hydrogen 
atom in a uniform magnetic field has already been dealt with in § 41 
and was so simple that perturbation theory was unnecessary. The 
case of a general atom is not much more complicated when we make 
a few approximations such that we can set up a simple model for the 
atom. 

We first of all consider the atom in the absence of the magnetic
field and look for constants of the motion or quantities that are
approximately constants of the motion. The total angular momentum
of the atom, the vector $\mathbf{j}$ say, is certainly a constant of the
motion. This angular momentum may be regarded as the sum of two
parts, the total orbital angular momentum of all the electrons,
$\mathbf{l}$ say, and the total spin angular momentum, $\mathbf{s}$ say.
Thus we have $\mathbf{j} = \mathbf{l} + \mathbf{s}$.
Now the effect of the spin magnetic moments on the motion of the
electrons is small compared with the effect of the Coulomb forces and
may be neglected as a first approximation. With this approximation
the spin angular momentum of each electron is a constant of the
motion, there being no forces tending to change its orientation. Thus
$\mathbf{s}$, and hence also $\mathbf{l}$, will be constants of the motion.
The magnitudes, $l$, $s$, and $j$ say, of $\mathbf{l}$, $\mathbf{s}$, and
$\mathbf{j}$ will be given by

$$\begin{aligned}
l + \tfrac{1}{2}\hbar &= (l_x^2 + l_y^2 + l_z^2 + \tfrac{1}{4}\hbar^2)^{1/2}, \\
s + \tfrac{1}{2}\hbar &= (s_x^2 + s_y^2 + s_z^2 + \tfrac{1}{4}\hbar^2)^{1/2}, \\
j + \tfrac{1}{2}\hbar &= (j_x^2 + j_y^2 + j_z^2 + \tfrac{1}{4}\hbar^2)^{1/2},
\end{aligned}$$

corresponding to equation (39) of § 36. They commute with each
other, and from (47) of § 36 we see that with given numerical values
for $l$ and $s$ the possible numerical values for $j$ are

$$l+s,\quad l+s-\hbar,\quad \ldots,\quad |l-s|.$$
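The list of possible values of $j$ is easily enumerated. The following sketch assumes units in which $\hbar = 1$, so that $l$ is a non-negative integer and $s$ an integer or half-odd integer; the function name `allowed_j` is introduced only for this illustration.

```python
from fractions import Fraction

def allowed_j(l, s):
    """Return l+s, l+s-1, ..., |l-s| (units with hbar = 1), largest first."""
    l, s = Fraction(l), Fraction(s)
    top, bottom = l + s, abs(l - s)
    # top - bottom equals twice the smaller of l and s, hence an integer.
    return [top - k for k in range(int(top - bottom) + 1)]

print(allowed_j(1, Fraction(1, 2)))  # one electron outside closed shells, l = 1
print(allowed_j(2, 1))               # a term with l = 2, s = 1
```

Each value of $j$ in the list labels one level of the multiplet belonging to the given $l$ and $s$.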

Let us consider a stationary state for which $l$, $s$, and $j$ have
definite numerical values in agreement with the above scheme. The energy
of this state will depend on $l$, but one might think that with neglect
of the spin magnetic moments it would be independent of $s$, and
also of the direction of the vector $\mathbf{s}$ relative to $\mathbf{l}$,
and thus of $j$. It will be found in Chapter IX, however, that the energy
depends very much on the magnitude $s$ of the vector $\mathbf{s}$,
although independent of its direction when one neglects the spin
magnetic moments, on account of certain phenomena arising from the fact
that the electrons are indistinguishable one from another. There are
thus different energy-levels of the system for each different value of
$l$ and $s$. This means that $l$ and $s$ are functions of the energy,
according to the general definition of a function given in § 11, since
the $l$ and $s$ of a stationary state are fixed when the energy of that
state is fixed.

We can now take into account the effect of the spin magnetic
moments, treating it as a small perturbation according to the method
of § 43. The energy of the unperturbed system will still be
approximately a constant of the motion and hence $l$ and $s$, being
functions of this energy, will still be approximately constants of the
motion. The directions of the vectors $\mathbf{l}$ and $\mathbf{s}$,
however, not being functions of the unperturbed energy, need not now be
approximately constants of the motion and may undergo large secular
variations. Since the vector $\mathbf{j}$ is constant, the only possible
variation of $\mathbf{l}$ and $\mathbf{s}$ is a precession about the
vector $\mathbf{j}$. We thus have an approximate model of the atom
consisting of the two vectors $\mathbf{l}$ and $\mathbf{s}$ of constant
lengths precessing about their sum $\mathbf{j}$, which is a fixed vector.
The energy is determined mainly by the magnitudes of $\mathbf{l}$ and
$\mathbf{s}$ and depends only slightly on their relative directions,
specified by $j$. Thus states with the same $l$ and $s$ and different $j$
will have only slightly different energy-levels, forming what is called
a multiplet term.

Let us now take this atomic model as our unperturbed system and
suppose it to be subjected to a uniform magnetic field of magnitude
$\mathcal{H}$ in the direction of the $z$-axis. The extra energy due to
this magnetic field will consist of a term

$$\frac{e\mathcal{H}}{2mc}\,(m_z + \hbar\sigma_z), \tag{41}$$

like the last term in equation (89) of § 41, contributed by each
electron, and will thus be altogether

$$\frac{e\mathcal{H}}{2mc} \sum (m_z + \hbar\sigma_z) = \frac{e\mathcal{H}}{2mc}\,(l_z + 2s_z) = \frac{e\mathcal{H}}{2mc}\,(j_z + s_z). \tag{42}$$

This is our perturbing energy $V$. We shall now use the method of
§ 43 to determine the changes in the energy-levels caused by this $V$.
The method will be legitimate only provided the field is so weak that
$V$ is small compared with the energy differences within a multiplet.

Our unperturbed system is degenerate, on account of the direction
of the vector $\mathbf{j}$ being undetermined. We must therefore take,
from the representative of $V$ in a Heisenberg representation for the
unperturbed system, those matrix elements that refer to one particular
energy-level for their row and column, and obtain the eigenvalues of
the matrix thus formed. We can do this best by first splitting up $V$
into two parts, one of which is a constant of the unperturbed motion,
so that its representative contains only matrix elements referring to
the same unperturbed energy-level for their row and column, while
the representative of the other contains only matrix elements
referring to two different unperturbed energy-levels for their row and
column, so that this second part does not affect the first-order
perturbation. The term involving $j_z$ in (42) is a constant of the
unperturbed motion and thus belongs entirely to the first part. For the
term involving $s_z$ we have

$$s_z(j_x^2 + j_y^2 + j_z^2) = j_z(s_x j_x + s_y j_y + s_z j_z) + (s_z j_x - j_z s_x)j_x + (s_z j_y - j_z s_y)j_y$$

or

$$s_z = j_z\,\frac{s_x j_x + s_y j_y + s_z j_z}{j(j+\hbar)} - (\gamma_y j_x - \gamma_x j_y)\,\frac{1}{j(j+\hbar)}, \tag{43}$$

where

$$\begin{aligned}
\gamma_x &= s_z j_y - j_z s_y = s_z l_y - l_z s_y = l_y s_z - l_z s_y, \\
\gamma_y &= j_z s_x - s_z j_x = l_z s_x - s_z l_x = l_z s_x - l_x s_z.
\end{aligned} \tag{44}$$

The first term in this expression for $s_z$ is a constant of the
unperturbed motion and thus belongs entirely to the first part, while the
second term, as we shall now see, belongs entirely to the second part.

Corresponding to (44) we can introduce

$$\gamma_z = l_x s_y - l_y s_x.$$

It can now easily be verified that

$$j_x\gamma_x + j_y\gamma_y + j_z\gamma_z = 0$$

and from (30) of § 35

$$[j_z, \gamma_x] = \gamma_y, \qquad [j_z, \gamma_y] = -\gamma_x, \qquad [j_z, \gamma_z] = 0.$$

These relations connecting $j_x$, $j_y$, $j_z$ and $\gamma_x$, $\gamma_y$,
$\gamma_z$ are of the same form as the relations connecting $m_x$, $m_y$,
$m_z$ and $x$, $y$, $z$ in the calculation in § 40 of the selection rule
for the matrix elements of $z$ in a representation with $k$ diagonal.
From the result there obtained that all matrix elements of $z$ vanish
except those referring to two $k$ values differing by $\pm\hbar$, we can
infer that all matrix elements of $\gamma_z$, and similarly of $\gamma_x$
and $\gamma_y$, in a representation with $j$ diagonal, vanish except those
referring to two $j$ values differing by $\pm\hbar$. The coefficients of
$\gamma_y$ and $\gamma_x$ in the second term on the right-hand side of
(43) commute with $j$, so the representative of the whole of this term
will contain only matrix elements referring to two $j$ values differing
by $\pm\hbar$, and thus referring to two different energy-levels of the
unperturbed system.

Hence the perturbing energy $V$ becomes, when we neglect that
part of it whose representative consists of matrix elements referring
to two different unperturbed energy-levels,

$$\frac{e\mathcal{H}}{2mc}\, j_z \left\{ 1 + \frac{j(j+\hbar) - l(l+\hbar) + s(s+\hbar)}{2j(j+\hbar)} \right\}. \tag{45}$$

The eigenvalues of this give the first-order changes in the
energy-levels. We can make the representative of this expression diagonal
by choosing our representation such that $j_z$ is diagonal, and it then
gives us directly the first-order changes in the energy-levels caused by
the magnetic field. This expression is known as Landé's formula.

The result (45) holds only provided the perturbing energy $V$ is small
compared with the energy differences within a multiplet. For larger
values of $V$ a more complicated theory is required. For very strong
fields, however, for which $V$ is large compared with the energy
differences within a multiplet, the theory is again very simple. We may
now neglect altogether the energy of the spin magnetic moments for
the atom with no external field, so that for our unperturbed system
the vectors $\mathbf{l}$ and $\mathbf{s}$ themselves are constants of the
motion, and not merely their magnitudes $l$ and $s$. Our perturbing
energy $V$, which is still $(e\mathcal{H}/2mc)(j_z + s_z)$, is now a
constant of the motion for the unperturbed system, so that its
eigenvalues give directly the changes in the energy-levels. These
eigenvalues are integral or half-odd integral multiples of
$e\mathcal{H}\hbar/2mc$ according to whether the number of electrons
in the atom is even or odd.
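The weak-field result (45) lends itself to a short calculation. The sketch below evaluates the factor in curly brackets and the resulting first-order shifts for each eigenvalue of $j_z$. It assumes units with $\hbar = 1$, in which $j(j+\hbar)$ becomes $j(j+1)$, expresses the shifts in units of $e\mathcal{H}\hbar/2mc$, and uses hypothetical function names.

```python
from fractions import Fraction

def lande_factor(j, l, s):
    """Curly bracket of (45) with hbar = 1: 1 + [j(j+1) - l(l+1) + s(s+1)] / [2j(j+1)]."""
    j, l, s = Fraction(j), Fraction(l), Fraction(s)
    if j == 0:
        raise ValueError("j = 0 gives a single level with j_z = 0 and no shift")
    return 1 + (j * (j + 1) - l * (l + 1) + s * (s + 1)) / (2 * j * (j + 1))

def zeeman_shifts(j, l, s):
    """First-order shifts, in units of e*H*hbar/(2mc), for j_z = -j, ..., +j."""
    g = lande_factor(j, l, s)
    j = Fraction(j)
    return [g * (-j + k) for k in range(int(2 * j) + 1)]

half = Fraction(1, 2)
print(lande_factor(3 * half, 1, half))   # level with l = 1, s = 1/2, j = 3/2
print(zeeman_shifts(3 * half, 1, half))  # four equally spaced sub-levels
```

Exact fractions are used so that the rational values of the factor are not obscured by rounding.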
48. General remarks


In this chapter we shall investigate problems connected with a par- 
ticle which, coming from infinity, encounters or ‘collides with’ some 
atomic system and, after being scattered through a certain angle, goes 
off to infinity again. The atomic system which does the scattering 
we shall call, for brevity, the scatterer. We thus have a dynamical 
system composed of an incident particle and a scatterer interacting 
with each other, which we must deal with according to the laws of 
quantum mechanics, and for which we must, in particular, calculate 
the probability of scattering through any given angle. The scatterer 
is usually assumed to be of infinite mass and to be at rest throughout 
the scattering process. The problem was first solved by Born by a
method substantially equivalent to that of the next section. We must
take into account the possibility that the scatterer, considered as a
system by itself, may have a number of different stationary states
and that, if it is in one of these states when the particle arrives
from infinity, it may be left in a different one when the particle goes
off to infinity again. The colliding particle may thus induce
transitions in the scatterer.

The Hamiltonian for the whole system of scatterer plus particle
will not involve the time explicitly, so that this whole system will
have stationary states represented by periodic solutions of
Schrödinger's wave equation. The meaning of these stationary states
requires a little care to be properly understood. It is evident that 
for any state of motion of the system the particle will spend nearly all 
its time at infinity, so that the time average of the probability of the 
particle being in any finite volume will be zero. Now for a stationary 
state the probability of the particle being in a given finite volume, 
like any other result of observation, must be independent of the time, 
and hence this probability will equal its time average, which we have 
seen is zero. Thus only the relative probabilities of the particle being 
in different finite volumes will be physically significant, their absolute 
values being all zero. The total energy of the system has a continuous
range of eigenvalues, since the initial energy of the particle can be
anything. Thus a ket, $|s\rangle$ say, corresponding to a stationary state,
being an eigenket of the total energy, must be of infinite length. We
can see a physical reason for this, since if $|s\rangle$ were normalized
and if $Q$ denotes that observable (a certain function of the position
of the particle) that is equal to unity if the particle is in a given
finite volume and zero otherwise, then $\langle s|Q|s\rangle$ would be zero,
meaning that the average value of $Q$, i.e. the probability of the
particle being in the given volume, is zero. Such a ket $|s\rangle$ would
not be a convenient one to work with. However, with $|s\rangle$ of infinite
length, $\langle s|Q|s\rangle$ can be finite and would then give the relative
probability of the particle being in the given volume.

In picturing a state of a system corresponding to a ket $|x\rangle$ which
is not normalized, but for which $\langle x|x\rangle = n$ say, it may be
convenient to suppose that we have $n$ similar systems all occupying the
same space but with no interaction between them, so that each one follows
out its own motion independently of the others, as we had in the
theory of the Gibbs ensemble in § 33. We can then interpret
$\langle x|\alpha|x\rangle$, where $\alpha$ is any observable, directly as
the total $\alpha$ for all the $n$ systems. In applying these ideas to the
above-mentioned $|s\rangle$ of infinite length, corresponding to a
stationary state of the system of scatterer plus colliding particle, we
should picture an infinite number of such systems with the scatterers
all located at the same point and the particles distributed continuously
throughout space. The number of particles in a given finite volume would
be pictured as $\langle s|Q|s\rangle$, $Q$ being the observable defined
above, which has the value unity when the particle is in the given
volume and zero otherwise. If the ket is represented by a Schrödinger
wave function involving the Cartesian coordinates of the particle, then
the square of the modulus of the wave function could be interpreted
directly as the density of particles in the picture.
One must remember, however, that each of these particles has its own 
individual scatterer. Different particles may belong to scatterers in 
different states. There will thus be one particle density for each state 
of the scatterer, namely the density of those particles belonging to 
scatterers in that state. This is taken account of by the wave function 
involving variables describing the state of the scatterer in addition 
to those describing the position of the particle. 

For determining scattering coefficients we have to investigate
stationary states of the whole system of scatterer plus particle. For
instance, if we want to determine the probability of scattering in
various directions when the scatterer is initially in a given stationary
state and the incident particle has initially a given velocity in a given
direction, we must investigate that stationary state of the whole 
system whose picture, according to the above method, contains at 
great distances from the point of location of the scatterers only 
particles moving with the given initial velocity and direction and 
belonging each to a scatterer in the given initial stationary state, 
together with particles moving outward from the point of location 
of the scatterers and belonging possibly to scatterers in various 
stationary states. This picture corresponds closely to the actual state 
of affairs in an experimental determination of scattering coefficients, 
with the difference that the picture really describes only one actual 
system of scatterer plus particle. The distribution of outward moving 
particles at infinity in the picture gives us immediately all the infor- 
mation about scattering coefficients that could be obtained by experi- 
ment. For practical calculations about the stationary state described 
by this picture one may use a perturbation method somewhat like 
that of § 43, taking as unperturbed system, for example, that for 
which there is no interaction between the scatterer and particle. 

In dealing with collision problems, a further possibility to be taken 
into consideration is that the scatterer may perhaps be capable of 
absorbing and re-emitting the particle. This possibility arises when 
there exists one or more states of absorption of the whole system, a 
state of absorption being an approximately stationary state which 
is closed in the sense mentioned at the end of § 33 (i.e. for which
the probability of the particle being at a greater distance than $r$ from
the scatterer tends to zero as $r \to \infty$). Since a state of absorption is
only approximately stationary, its property of being closed will be 
only a transient one, and after a sufficient lapse of time there will be 
a finite probability of the particle being on its way to infinity. 
Physically this means there is a finite probability of spontaneous 
emission of the particle. The fact that we had to use the word 
‘approximately’ in stating the conditions required for the phenomena 
of emission and absorption to be able to occur shows that these condi- 
tions are not expressible in exact mathematical language. One can give 
a meaning to these phenomena only with reference to a perturbation 
method. They occur when the unperturbed system (of scatterer plus 
particle) has stationary states that are closed. The introduction of the 
perturbation spoils the stationary property of these states and gives 
rise to spontaneous emission and its converse absorption.

For calculating absorption and emission probabilities it is necessary
to deal with non-stationary states of the system, in contradistinction 
to the case for scattering coefficients, so that the perturbation method 
of § 44 must be used. Thus for calculating an emission coefficient 
we must consider the non-stationary states of absorption described 
above. Again, since an absorption is always followed by a re-emission, 
it cannot be distinguished from a scattering in any experiment in- 
volving a steady state of affairs, corresponding to a stationary state 
of the system. The distinction can be made only by reference to a 
non-steady state of affairs, e.g. by use of a stream of incident particles 
that has a sharp beginning, so that the scattered particles will appear 
immediately after the incident particles meet the scatterers, while 
those that have been absorbed and re-emitted will begin to appear 
only some time later. This stream of particles would be the picture 
of a certain ket of infinite length, which could be used for calculating 
the absorption coefficient. 


49. The scattering coefficient 

We shall now consider the calculation of scattering coefficients, 
taking first the case when there is no absorption and emission, which 
means that our unperturbed system has no closed stationary states. 
We may conveniently take this unperturbed system to be that for 
which there is no interaction between the scatterer and particle. Its 
Hamiltonian will thus be of the form 


$$E = H_s + W, \tag{1}$$

where $H_s$ is that for the scatterer alone and $W$ that for the particle
alone, namely, with neglect of relativistic mechanics,

$$W = \frac{1}{2m}\,(p_x^2 + p_y^2 + p_z^2). \tag{2}$$

The perturbing energy V, assumed small, will now be a function of
the Cartesian coordinates of the particle x, y, z, and also, perhaps,
of its momenta $p_x$, $p_y$, $p_z$, together with dynamical variables
describing the scatterer.

Since we are now interested only in stationary states of the whole 
system, we use a perturbation method like that of §43. Our unper- 
turbed system now necessarily has a continuous range of energy- 
levels, since it contains a free particle, and this gives rise to certain 
modifications in the perturbation method. The question of the change 
in the energy-levels caused by the perturbation, which was the main
question of § 43, no longer has a meaning, and the convention in § 43
of using the same number of primes to denote nearly equal eigen-
values of E and H now drops out. Again, the splitting of energy-
levels which we had in § 43 when the unperturbed system is degenerate
cannot now arise, since if the unperturbed system is degenerate the 
perturbed one, which must also have a continuous range of energy- 
levels, will also be degenerate to exactly the same extent. 

We again use the general scheme of equations developed at the 
beginning of § 43, equations (1) to (4) there, but we now take our 
unperturbed stationary state forming the zero-order approximation 
to belong to an energy-level $E'$ just equal to the energy-level $H'$ of
our perturbed stationary state. Thus the a’s introduced in the second
of equations (3) § 43 are now all zero and the second of equations
(4) there now reads

$$(E' - E)|1\rangle = V|0\rangle. \tag{3}$$

Similarly, the third of equations (4) § 43 now reads

$$(E' - E)|2\rangle = V|1\rangle. \tag{4}$$

We shall proceed to solve equation (3) and to obtain the scattering 
coefficient to the first order. We shall need equation (4) in § 51. 
Let $\alpha$ denote a complete set of commuting observables describing
the scatterer, which are constants of the motion when the scatterer is
alone and may thus be used for labelling the stationary states of the
scatterer. This requires that $H_s$ shall commute with the $\alpha$’s and be
a function of them. We can now take a representation of the whole
system in which the $\alpha$’s and x, y, z, the coordinates of the particle,
are diagonal. This will make $H_s$ diagonal. Let $|0\rangle$ be represented by
$\langle \mathbf{x}\alpha'|0\rangle$ and $|1\rangle$ by $\langle \mathbf{x}\alpha'|1\rangle$, the single variable $\mathbf{x}$ being written to
denote x, y, z and the prime being omitted from $\mathbf{x}$ for brevity. Also
the single differential $d^3x$ will be written to denote the product dx dy dz.
Equation (3), written in terms of representatives, becomes, with the
help of (1) and (2),

$$\Big\{E' - H_s(\alpha') + \frac{\hbar^2}{2m}\nabla^2\Big\}\langle \mathbf{x}\alpha'|1\rangle = \sum_{\alpha''}\int \langle \mathbf{x}\alpha'|V|\mathbf{x}''\alpha''\rangle\, d^3x''\, \langle \mathbf{x}''\alpha''|0\rangle. \tag{5}$$

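Suppose that the incident particle has the momentum $\mathbf{p}^0$ and that
the initial stationary state of the scatterer is $\alpha^0$. The stationary state
of our unperturbed system is now the one for which $\mathbf{p} = \mathbf{p}^0$ and
$\alpha = \alpha^0$, and hence its representative is

$$\langle \mathbf{x}\alpha'|0\rangle = \delta_{\alpha'\alpha^0}\, e^{i(\mathbf{p}^0,\mathbf{x})/\hbar}. \tag{6}$$

This makes equation (5) reduce to

$$\Big\{E' - H_s(\alpha') + \frac{\hbar^2}{2m}\nabla^2\Big\}\langle \mathbf{x}\alpha'|1\rangle = \int \langle \mathbf{x}\alpha'|V|\mathbf{x}^0\alpha^0\rangle\, d^3x^0\, e^{i(\mathbf{p}^0,\mathbf{x}^0)/\hbar}$$

or

$$(k^2 + \nabla^2)\langle \mathbf{x}\alpha'|1\rangle = F, \tag{7}$$

where

$$k^2 = 2m\hbar^{-2}\{E' - H_s(\alpha')\} \tag{8}$$

and

$$F = 2m\hbar^{-2}\int \langle \mathbf{x}\alpha'|V|\mathbf{x}^0\alpha^0\rangle\, d^3x^0\, e^{i(\mathbf{p}^0,\mathbf{x}^0)/\hbar}, \tag{9}$$

a definite function of x, y, z, and $\alpha'$. We must also have

$$E' = H_s(\alpha^0) + \mathbf{p}^{0\,2}/2m. \tag{10}$$
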
Our problem now is to obtain a solution $\langle \mathbf{x}\alpha'|1\rangle$ of (7) which, for
values of x, y, z denoting points far from the scatterer, represents
only outward moving particles. The square of its modulus, $|\langle \mathbf{x}\alpha'|1\rangle|^2$,
will then give the density of scattered particles belonging to scatterers
in the state $\alpha'$ when the density of the incident particles is $|\langle \mathbf{x}\alpha^0|0\rangle|^2$,
which is unity. If we transform to polar coordinates $r$, $\theta$, $\phi$, equation
(7) becomes

$$\Big\{k^2 + \frac{1}{r^2}\frac{\partial}{\partial r}r^2\frac{\partial}{\partial r} + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\sin\theta\frac{\partial}{\partial\theta} + \frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial\phi^2}\Big\}\langle r\theta\phi\alpha'|1\rangle = F. \tag{11}$$

Now F must tend to zero as $r \to \infty$, on account of the physical re-
quirement that the interaction energy between the scatterer and
particle must tend to zero as the distance between them tends to
infinity. If we neglect F in (11) altogether, an approximate solution
for large r is

$$\langle r\theta\phi\alpha'|1\rangle = u(\theta\phi\alpha')\,r^{-1}e^{ikr}, \tag{12}$$

where $u$ is an arbitrary function of $\theta$, $\phi$, and $\alpha'$, since this expression
substituted in the left-hand side of (11) gives a result of order $r^{-3}$.
When we do not neglect F, the solution of (11) will still be of the
form (12) for large r, provided F tends to zero sufficiently rapidly as
$r \to \infty$, but the function $u$ will now be definite and determined by the
solution for smaller values of r.

For values $\alpha'$ of the $\alpha$’s such that $k^2$, defined by (8), is positive, the
$k$ in (12) must be chosen to be the positive square root of $k^2$, in order
that (12) may represent only outward moving particles, i.e. particles
for which the radial component of momentum, which from § 38
equals $p_r - i\hbar r^{-1}$ or $-i\hbar(\partial/\partial r + r^{-1})$, has a positive value. We now
have that the density of scattered particles belonging to scatterers in
state $\alpha'$, equal to the square of the modulus of (12), falls off with
increasing r according to the inverse square law, as is physically
necessary, and their angular distribution is given by $|u(\theta\phi\alpha')|^2$.
Further, the magnitude, $P'$ say, of the momentum of these scattered
particles must equal $k\hbar$, the momentum being radial for large r,
so that their energy is equal to

$$\frac{P'^2}{2m} = \frac{k^2\hbar^2}{2m} = E' - H_s(\alpha') = \frac{\mathbf{p}^{0\,2}}{2m} - H_s(\alpha') + H_s(\alpha^0),$$

with the help of (8) and (10). This is just the energy of an incident
particle, namely $\mathbf{p}^{0\,2}/2m$, reduced by the increase in energy of the
scatterer, namely $H_s(\alpha') - H_s(\alpha^0)$, in agreement with the law of
conservation of energy. For values $\alpha'$ of the $\alpha$’s such that $k^2$ is negative
there are no scattered particles, the total initial energy being
insufficient for the scatterer to be left in the state $\alpha'$.
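
The energy balance just stated can be sketched numerically. The following is a minimal illustration, not part of the original text: the function name and the plain-number arguments are hypothetical, and the borderline case of zero kinetic energy is treated here as closed, which is an assumption of the sketch.

```python
import math

def scattered_momentum(p0, m, h_s_initial, h_s_final):
    """Return P' = k*hbar for a given final scatterer state, or None.

    Uses (8) and (10): P'^2/2m = p0^2/2m - (H_s(alpha') - H_s(alpha0)).
    None means k^2 is not positive, so no particles are scattered
    with the scatterer left in that state.
    """
    kinetic = p0 ** 2 / (2.0 * m) - (h_s_final - h_s_initial)
    if kinetic <= 0.0:
        return None  # total initial energy insufficient (assumed closed at 0)
    return math.sqrt(2.0 * m * kinetic)
```

With p0 = 2 and m = 1 the incident energy is 2, so a scatterer excitation of 1.5 leaves P' = 1, while an excitation of 3 is not allowed.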

We must now evaluate $u(\theta\phi\alpha')$ for a set of values $\alpha'$ for the $\alpha$’s such
that $k^2$ is positive, and obtain the angular distribution of the scattered
particles belonging to scatterers in state $\alpha'$. It is sufficient to evaluate
$u$ for the direction $\theta = 0$ of the pole of the polar coordinates, since
this direction is arbitrary. We make use of Green’s theorem, which
states that for any two functions of position A and B the volume
integral $\int (A\nabla^2 B - B\nabla^2 A)\, d^3x$ taken over any volume equals the
surface integral $\int (A\,\partial B/\partial n - B\,\partial A/\partial n)\, dS$ taken over the boundary
of the volume, $\partial/\partial n$ denoting differentiation along the normal to
the surface. We take

$$A = e^{-ikr\cos\theta}, \qquad B = \langle r\theta\phi\alpha'|1\rangle$$

and apply the theorem to a large sphere with the origin as centre.
The volume integrand is thus

$$e^{-ikr\cos\theta}\,\nabla^2\langle r\theta\phi\alpha'|1\rangle - \langle r\theta\phi\alpha'|1\rangle\,\nabla^2 e^{-ikr\cos\theta} = e^{-ikr\cos\theta}(\nabla^2 + k^2)\langle r\theta\phi\alpha'|1\rangle = e^{-ikr\cos\theta}F$$

from (7) or (11), while the surface integrand is, with the help of (12),

$$e^{-ikr\cos\theta}\frac{\partial}{\partial r}\langle r\theta\phi\alpha'|1\rangle - \langle r\theta\phi\alpha'|1\rangle\frac{\partial}{\partial r}e^{-ikr\cos\theta}$$
$$= e^{-ikr\cos\theta}\,u\Big(-\frac{1}{r^2} + \frac{ik}{r}\Big)e^{ikr} + \frac{u}{r}\,e^{ikr}\,ik\cos\theta\,e^{-ikr\cos\theta}$$
$$= iku\,r^{-1}(1+\cos\theta)\,e^{ikr(1-\cos\theta)}$$

with neglect of $r^{-2}$. Hence we get

$$\int e^{-ikr\cos\theta}F\,d^3x = \int_0^{2\pi} d\phi \int_0^{\pi} r^2\sin\theta\,d\theta\;iku\,r^{-1}(1+\cos\theta)\,e^{ikr(1-\cos\theta)},$$

the volume integral on the left being taken over the whole of space.
The right-hand side becomes, on being integrated by parts with
respect to $\theta$,

$$\int_0^{2\pi} d\phi\,\Big\{\Big[u(1+\cos\theta)\,e^{ikr(1-\cos\theta)}\Big]_{\theta=0}^{\theta=\pi} - \int_0^{\pi} e^{ikr(1-\cos\theta)}\frac{\partial}{\partial\theta}\big[u(1+\cos\theta)\big]\,d\theta\Big\}.$$

The second term in the { } brackets is of the order of magnitude of
$r^{-1}$, as would be revealed by further partial integrations, and may
therefore be neglected. We are thus left with

$$\int e^{-ikr\cos\theta}F\,d^3x = -2\int_0^{2\pi} d\phi\;u(0\phi\alpha') = -4\pi u(0\phi\alpha'),$$

giving the value of $u(\theta\phi\alpha')$ for the direction $\theta = 0$.
This result may be written

$$u(0\phi\alpha') = -(4\pi)^{-1}\int e^{-iP'r\cos\theta/\hbar}F\,d^3x, \tag{13}$$

since $P' = k\hbar$. If the vector $\mathbf{p}'$ denotes the momentum of the scattered
electrons coming off in a certain direction (and is thus of magnitude
$P'$), the value of $u$ for this direction will be

$$u(\theta'\phi'\alpha') = -(4\pi)^{-1}\int e^{-i(\mathbf{p}',\mathbf{x})/\hbar}F\,d^3x,$$

as follows from (13) if one takes this direction to be the pole of the
polar coordinates. This becomes, with the help of (9),

$$u(\theta'\phi'\alpha') = -(2\pi)^{-1}m\hbar^{-2}\iint e^{-i(\mathbf{p}',\mathbf{x})/\hbar}\,d^3x\,\langle \mathbf{x}\alpha'|V|\mathbf{x}^0\alpha^0\rangle\,d^3x^0\,e^{i(\mathbf{p}^0,\mathbf{x}^0)/\hbar}$$
$$= -2\pi m h\,\langle \mathbf{p}'\alpha'|V|\mathbf{p}^0\alpha^0\rangle, \tag{14}$$

when one makes a transformation from the coordinates x to the
momenta p of the particle, using the transformation function (54)
of § 23. The single letter p is here used as a label for the three
components of momentum.

The density of scattered particles belonging to scatterers in state
$\alpha'$ is now given by $|u(\theta'\phi'\alpha')|^2/r^2$. Since their velocity is $P'/m$, the
rate at which these particles appear per unit solid angle about the
direction of the vector $\mathbf{p}'$ will be $(P'/m)\,|u(\theta'\phi'\alpha')|^2$. The density of
the incident particles is, as we have seen, unity, so that the number
of incident particles crossing unit area per unit time is equal to their
velocity $P^0/m$, where $P^0$ is the magnitude of $\mathbf{p}^0$. Hence the effective
area that must be hit by an incident particle in order to be scattered
in a unit solid angle about the direction $\mathbf{p}'$ and then belong to a
scatterer in state $\alpha'$ will be

$$\frac{P'}{P^0}\,|u(\theta'\phi'\alpha')|^2 = 4\pi^2m^2h^2\,\frac{P'}{P^0}\,|\langle \mathbf{p}'\alpha'|V|\mathbf{p}^0\alpha^0\rangle|^2. \tag{15}$$

This is the scattering coefficient for transitions $\alpha^0 \to \alpha'$ of the scatterer.
It depends on that matrix element $\langle \mathbf{p}'\alpha'|V|\mathbf{p}^0\alpha^0\rangle$ of the perturbing
energy V whose column $\mathbf{p}^0\alpha^0$ and whose row $\mathbf{p}'\alpha'$ refer respectively to
the initial and final states of the unperturbed system, between which 
the scattering transition process takes place. The result (15) is thus 
in some ways analogous to the result (24) of § 44, although the 
numerical coefficients are different in the two cases, corresponding 
to the different natures of the two transition processes. 
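
As a hedged illustration of how (15) is used, the sketch below evaluates it for a structureless scatterer and a local potential chosen purely for the example, V(r) = g exp(-mu r)/r. This potential, the parameter names g and mu, and the units with hbar = 1 are assumptions that do not appear in the text. For a local potential the matrix element is $h^{-3}$ times the Fourier transform of V, by the transformation function quoted from § 23, and for this particular V the transform has the standard closed form $4\pi g/(\mu^2 + q^2)$ with $q = |\mathbf{p}' - \mathbf{p}^0|/\hbar$.

```python
import math

HBAR = 1.0                 # illustrative units (assumption)
H = 2.0 * math.pi * HBAR   # Planck's constant h as it enters (14) and (15)

def yukawa_matrix_element(p_final, p_initial, g, mu):
    """<p'|V|p0> for the hypothetical potential V(r) = g*exp(-mu*r)/r."""
    q2 = sum((a - b) ** 2 for a, b in zip(p_final, p_initial)) / HBAR ** 2
    fourier = 4.0 * math.pi * g / (mu ** 2 + q2)  # integral of V*exp(-i q.x)
    return fourier / H ** 3

def scattering_coefficient(p_final, p_initial, m, matrix_element):
    """Formula (15): 4 pi^2 m^2 h^2 (P'/P0) |<p'a'|V|p0 a0>|^2."""
    big_p_final = math.sqrt(sum(a * a for a in p_final))
    big_p_initial = math.sqrt(sum(a * a for a in p_initial))
    return (4.0 * math.pi ** 2 * m ** 2 * H ** 2
            * big_p_final / big_p_initial * abs(matrix_element) ** 2)
```

For this example the factors of h cancel and (15) reduces to $(2mg/\hbar^2)^2/(\mu^2+q^2)^2$, which gives a quick consistency check on the numerical coefficients.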


50. Solution with the momentum representation 

The result (15) for the scattering coefficient makes a reference only 
to that representation in which the momentum p is diagonal. One 
would thus expect to be able to get a more direct proof of the result 
by working all the time in the p-representation, instead of working 
in the x-representation and transforming at the end to the p-repre- 
sentation, as was done in § 49. This would not at first sight appear 
to be a great improvement, as the lack of directness of the x-repre-
sentation method is offset by more direct applicability, it being
possible to picture the square of the modulus of the x-representative 
of a state as the density of a stream of particles in process of being 
scattered. The x-representation method has, however, other more 
serious disadvantages. One of the main applications of the theory 
of collisions is to the case of photons as incident particles. Now a 
photon is not a simple particle but has a polarization. It is evident 
from classical electromagnetic theory that a photon with a definite 
momentum, i.e. one moving in a definite direction with a definite 
frequency, may have a definite state of polarization (linear, circular, 
etc.), while a photon with a definite position, which is to be pictured 
as an electromagnetic disturbance confined to a very small volume, 
cannot have any definite polarization. These facts mean that the 
polarization observable of a photon commutes with its momentum 
but not with its position. This results in the p-representation method 
being immediately applicable to the case of photons, it being only 
necessary to introduce the polarizing variable into the representatives 
and treat it along with the $\alpha$’s describing the scatterer, while the
x-representation method is not applicable. Further, in dealing with
photons, it is necessary to take relativistic mechanics into account. 
This can easily be done in the p-representation method, but not so 
easily in the x-representation method. 

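Equation (3) still holds with relativistic mechanics, but W is now
given by

$$W^2/c^2 = m^2c^2 + P^2 = m^2c^2 + p_x^2 + p_y^2 + p_z^2 \tag{16}$$

instead of by (2). Written in terms of p-representatives, equation (3)
gives

$$\{E' - H_s(\alpha') - W\}\langle \mathbf{p}\alpha'|1\rangle = \langle \mathbf{p}\alpha'|V|0\rangle,$$

$\mathbf{p}$ being written instead of $\mathbf{p}'$ for brevity and W being understood as
a definite function of $p_x$, $p_y$, $p_z$ given by (16). This may be written

$$(W' - W)\langle \mathbf{p}\alpha'|1\rangle = \langle \mathbf{p}\alpha'|V|0\rangle, \tag{17}$$

where

$$W' = E' - H_s(\alpha') \tag{18}$$

and is the energy required by the law of conservation of energy for
a scattered particle belonging to a scatterer in state $\alpha'$. The ket $|0\rangle$
is represented by (6) in the x-representation and the basic ket $|\mathbf{p}^0\alpha^0\rangle$
is represented by

$$\langle \mathbf{x}\alpha'|\mathbf{p}^0\alpha^0\rangle = \delta_{\alpha'\alpha^0}\langle \mathbf{x}|\mathbf{p}^0\rangle = \delta_{\alpha'\alpha^0}\,h^{-3/2}e^{i(\mathbf{p}^0,\mathbf{x})/\hbar},$$

from the transformation function (54) of § 23. Hence

$$|0\rangle = h^{3/2}|\mathbf{p}^0\alpha^0\rangle, \tag{19}$$

and equation (17) may be written

$$(W' - W)\langle \mathbf{p}\alpha'|1\rangle = h^{3/2}\langle \mathbf{p}\alpha'|V|\mathbf{p}^0\alpha^0\rangle. \tag{20}$$
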


We now make a transformation from the Cartesian coordinates
$p_x$, $p_y$, $p_z$ of $\mathbf{p}$ to its polar coordinates $P$, $\omega$, $\chi$, given by

$$p_x = P\cos\omega, \qquad p_y = P\sin\omega\cos\chi, \qquad p_z = P\sin\omega\sin\chi.$$

If in the new representation we take the weight function $P^2\sin\omega$,
then the weight attached to any volume of p-space will be the same
as in the previous p-representation, so that the transformation will
mean simply a relabelling of the rows and columns of the matrices
without any alteration of the matrix elements. Thus (20) will become
in the new representation

$$(W' - W)\langle P\omega\chi\alpha'|1\rangle = h^{3/2}\langle P\omega\chi\alpha'|V|P^0\omega^0\chi^0\alpha^0\rangle, \tag{21}$$

W being now a function of the single variable P.

The coefficient of $\langle P\omega\chi\alpha'|1\rangle$, namely $W' - W$, is now simply a
multiplying factor and not a differential operator as it was with the
x-representation method. We can therefore divide out by this factor
and obtain an explicit expression for $\langle P\omega\chi\alpha'|1\rangle$. When, however, $\alpha'$
is such that $W'$, defined by (18), is greater than $mc^2$, this factor will
have the value zero for a certain point in the domain of the variable
P, namely the point $P = P'$, given in terms of $W'$ by (16). The
function $\langle P\omega\chi\alpha'|1\rangle$ will then have a singularity at this point. This
singularity shows that $\langle P\omega\chi\alpha'|1\rangle$ represents an infinite number of
particles moving about at great distances from the scatterers with
energies indefinitely close to $W'$ and it is therefore this singularity
that we have to study to get the angular distribution of the particles
at infinity.
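
The location of the singular point follows directly from (16) and (18). The sketch below is illustrative only; the function names are hypothetical and all quantities are plain numbers in whatever consistent units the reader chooses.

```python
import math

def particle_energy(e_total, h_s_final):
    """(18): W' = E' - H_s(alpha')."""
    return e_total - h_s_final

def singular_momentum(w_prime, m, c):
    """Solve (16), W'^2/c^2 = m^2 c^2 + P'^2, for P'.

    Returns None unless W' > m c^2, in which case W' - W never vanishes
    and the representative has no singularity.
    """
    if w_prime <= m * c ** 2:
        return None
    return math.sqrt(w_prime ** 2 / c ** 2 - m ** 2 * c ** 2)
```

For m = c = 1, a total energy of 2.25 with the scatterer left at energy 1 gives W' = 1.25 and hence P' = 0.75.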

The result of dividing out (21) by the factor $W' - W$ is, according
to (13) of § 15,

$$\langle P\omega\chi\alpha'|1\rangle = h^{3/2}\langle P\omega\chi\alpha'|V|P^0\omega^0\chi^0\alpha^0\rangle/(W' - W) + \lambda(\omega\chi\alpha')\,\delta(W' - W), \tag{22}$$

where $\lambda$ is an arbitrary function of $\omega$, $\chi$, and $\alpha'$. To give a meaning
to the first term on the right-hand side of (22), we make the conven-
tion that its integral with respect to P over a range that includes the
value $P'$ is the limit when $\epsilon \to 0$ of the integral when the small
domain $P' - \epsilon$ to $P' + \epsilon$ is excluded from the range of integration.
This is sufficient to make the meaning of (22) precise, since we are
interested effectively only in the integrals of the representatives of
states when the representation has continuous ranges of rows and
columns. We see that equation (21) is inadequate to determine the
representative $\langle P\omega\chi\alpha'|1\rangle$ completely, on account of the arbitrary
function $\lambda$ occurring in (22). We must choose this $\lambda$ such that
$\langle P\omega\chi\alpha'|1\rangle$ represents only outward moving particles, since we want
the only inward moving particles to be those corresponding to $|0\rangle$.

Let us take first the general case when the representative $\langle P\omega\chi|\rangle$
of a state of the particle satisfies an equation of the type

$$(W' - W)\langle P\omega\chi|\rangle = f(P\omega\chi), \tag{23}$$

where $f(P\omega\chi)$ is any function of $P$, $\omega$, and $\chi$, and $W'$ is a number
greater than $mc^2$, so that $\langle P\omega\chi|\rangle$ is of the form

$$\langle P\omega\chi|\rangle = f(P\omega\chi)/(W' - W) + \lambda(\omega\chi)\,\delta(W' - W), \tag{24}$$

and let us determine now what $\lambda$ must be in order that $\langle P\omega\chi|\rangle$ may
represent only outward moving particles. We can do this by trans-
forming $\langle P\omega\chi|\rangle$ to the x-representation, or rather the $(r\theta\phi)$-repre-
sentation, and comparing it with (12) for large values of r. The
transformation function is

$$\langle r\theta\phi|P\omega\chi\rangle = h^{-3/2}e^{i(\mathbf{p},\mathbf{x})/\hbar} = h^{-3/2}e^{iPr[\cos\omega\cos\theta + \sin\omega\sin\theta\cos(\chi-\phi)]/\hbar}.$$

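For the direction $\theta = 0$ we find

$$\langle r0\phi|\rangle = h^{-3/2}\int_0^\infty P^2\,dP\int_0^{2\pi}d\chi\int_0^\pi \sin\omega\,d\omega\;e^{iPr\cos\omega/\hbar}\langle P\omega\chi|\rangle$$
$$= ih^{-3/2}\int_0^\infty P^2\,dP\int_0^{2\pi}d\chi\,\frac{\hbar}{Pr}\Big\{\Big[e^{iPr\cos\omega/\hbar}\langle P\omega\chi|\rangle\Big]_{\omega=0}^{\omega=\pi} - \int_0^\pi e^{iPr\cos\omega/\hbar}\frac{\partial}{\partial\omega}\langle P\omega\chi|\rangle\,d\omega\Big\}.$$

The second term in the { } brackets is of order $r^{-1}$, as may be verified
by further partial integrations with respect to $\omega$, and can therefore
be neglected. We are left with

$$\langle r0\phi|\rangle = ih^{-1/2}(2\pi r)^{-1}\int_0^\infty P\,dP\int_0^{2\pi}d\chi\,\{e^{-iPr/\hbar}\langle P\pi\chi|\rangle - e^{iPr/\hbar}\langle P0\chi|\rangle\}$$
$$= ih^{-1/2}r^{-1}\int_0^\infty P\,dP\,\{e^{-iPr/\hbar}\langle P\pi\chi|\rangle - e^{iPr/\hbar}\langle P0\chi|\rangle\}. \tag{25}$$
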


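When we substitute for $\langle P\omega\chi|\rangle$ its value given by (24), the first
term in the integrand in (25) gives

$$ih^{-1/2}r^{-1}\int_0^\infty P\,dP\,e^{-iPr/\hbar}\{f(P\pi\chi)/(W' - W) + \lambda(\pi\chi)\,\delta(W' - W)\}. \tag{26}$$

The term involving $\delta(W' - W)$ here may be integrated immediately
and gives, when one uses the relation $P\,dP = W\,dW/c^2$, which
follows from (16),

$$ih^{-1/2}c^{-2}r^{-1}\int W\,dW\,e^{-iPr/\hbar}\lambda(\pi\chi)\,\delta(W' - W) = ih^{-1/2}c^{-2}r^{-1}W'\lambda(\pi\chi)\,e^{-iP'r/\hbar}. \tag{27}$$
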
To integrate the other term in (26) we use the formula

$$\int_0^\infty g(P)\,\frac{e^{-iPr/\hbar}}{P' - P}\,dP = g(P')\int_0^\infty \frac{e^{-iPr/\hbar}}{P' - P}\,dP, \tag{28}$$

with neglect of terms involving $r^{-1}$, for any continuous function $g(P)$,
which formula holds since $\int_0^\infty K(P)\,e^{-iPr/\hbar}\,dP$ is of order $r^{-1}$ for any
continuous function $K(P)$ and since the difference

$$g(P)/(P' - P) - g(P')/(P' - P)$$

is continuous. The right-hand side of (28), when evaluated with
neglect of terms involving $r^{-1}$, and also with neglect of the small
domain $P' - \epsilon$ to $P' + \epsilon$ in the domain of integration, gives

$$g(P')\int_{-\infty}^{\infty}\frac{e^{-iPr/\hbar}}{P' - P}\,dP = g(P')\,e^{-iP'r/\hbar}\int_{-\infty}^{\infty}\frac{e^{i(P'-P)r/\hbar}}{P' - P}\,dP$$
$$= ig(P')\,e^{-iP'r/\hbar}\int_{-\infty}^{\infty}\frac{\sin\{(P'-P)r/\hbar\}}{P' - P}\,dP = i\pi g(P')\,e^{-iP'r/\hbar}. \tag{29}$$

In our present example $g(P)$ is

$$g(P) = ih^{-1/2}r^{-1}P\,f(P\pi\chi)\,(P' - P)/(W' - W),$$

which has the limiting value when $P = P'$,

$$g(P') = ih^{-1/2}r^{-1}P'f(P'\pi\chi)\,W'/P'c^2 = ih^{-1/2}c^{-2}r^{-1}W'f(P'\pi\chi).$$

Substituting this in (29) and adding on the expression (27), we obtain
the following value for the integral (26)

$$h^{-1/2}c^{-2}r^{-1}W'\{-\pi f(P'\pi\chi) + i\lambda(\pi\chi)\}\,e^{-iP'r/\hbar}. \tag{30}$$

Similarly the second term in the integrand in (25) gives

$$h^{-1/2}c^{-2}r^{-1}W'\{-\pi f(P'0\chi) - i\lambda(0\chi)\}\,e^{iP'r/\hbar}. \tag{31}$$

The sum of these two expressions is the value of $\langle r0\phi|\rangle$ when r is
large.

We require that $\langle r0\phi|\rangle$ shall represent only outward moving
particles, and hence it must be of the form of a multiple of $e^{iP'r/\hbar}$.
Thus (30) must vanish, so that

$$\lambda(\pi\chi) = -i\pi f(P'\pi\chi). \tag{32}$$

We see in this way that the condition that $\langle r0\phi|\rangle$ shall represent
only outward moving particles in the direction $\theta = 0$ fixes the value
of $\lambda$ for the opposite direction $\omega = \pi$. Since the direction $\theta = 0$ or
$\omega = 0$ of the pole of our polar coordinates is not in any way singular,
we can generalize (32) to

$$\lambda(\omega\chi) = -i\pi f(P'\omega\chi), \tag{33}$$

which gives the value of $\lambda$ for an arbitrary direction. This value
substituted in (24) gives a result that may be written

$$\langle P\omega\chi|\rangle = f(P\omega\chi)\{1/(W' - W) - i\pi\,\delta(W' - W)\}, \tag{34}$$

since one can substitute $P'$ for $P$ in the coefficient of a term involving
$\delta(W' - W)$ as a factor without changing the value of the term. The
condition that $\langle P\omega\chi|\rangle$ shall represent only outward moving particles is
thus that it shall contain the factor

$$\{1/(W' - W) - i\pi\,\delta(W' - W)\}. \tag{35}$$

It is interesting to note that this factor is of the form of the right-
hand side of equation (15) of § 15.
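
A numerical sketch may make the factor (35) more concrete. It relies on a standard identity that is not stated in the text and is taken here as an assumption: with the principal-value convention adopted for (22), the factor (35) acts under an integral like the limit of $1/(W' - W + i\epsilon)$ as the small positive number $\epsilon$ tends to zero. The function names, the Gaussian test function and the grid parameters below are all illustrative choices.

```python
import math

def smeared_factor_integral(g, eps, half_range=8.0, steps=160000):
    """Midpoint-rule integral of g(x)/(x + i*eps) over [-half_range, half_range].

    Here x stands for W' - W. The grid step must be much smaller than eps
    for the narrow peak near x = 0 to be resolved.
    """
    dx = 2.0 * half_range / steps
    total = 0j
    for n in range(steps):
        x = -half_range + (n + 0.5) * dx
        total += g(x) / complex(x, eps)
    return total * dx

def gaussian(x):
    """Smooth even test function with gaussian(0) = 1."""
    return math.exp(-x * x)
```

For an even test function the principal-value part vanishes by symmetry, so the integral should approach $-i\pi\,g(0)$, which is what the delta-function term in (35) contributes.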
With $\lambda$ given by (33), expression (30) vanishes and the value of
$\langle r0\phi|\rangle$ for large r is given by expression (31) alone, thus

$$\langle r0\phi|\rangle = -2\pi h^{-1/2}c^{-2}r^{-1}W'f(P'0\chi)\,e^{iP'r/\hbar}.$$

This may be generalized to

$$\langle r\theta\phi|\rangle = -2\pi h^{-1/2}c^{-2}r^{-1}W'f(P'\omega\chi)\,e^{iP'r/\hbar},$$

giving the value of $\langle r\theta\phi|\rangle$ for any direction $\theta$, $\phi$ in terms of $f(P'\omega\chi)$
for the same direction labelled by $\omega$, $\chi$. This is of the form (12) with

$$u(\theta\phi) = -2\pi h^{-1/2}c^{-2}W'f(P'\omega\chi)$$

and thus represents a distribution of outward moving particles of
momentum $P'$ whose number is

$$\frac{c^2P'}{W'}\,|u|^2 = \frac{4\pi^2W'P'}{hc^2}\,|f(P'\omega\chi)|^2 \tag{36}$$

per unit solid angle per unit time. This distribution is the one
represented by the $\langle P\omega\chi|\rangle$ of (34).
From this general result we can infer that, whenever we have a
representative $\langle P\omega\chi|\rangle$ representing only outward moving particles
and satisfying an equation of the type (23), the number per unit solid
angle per unit time of these particles is given by (36). If this $\langle P\omega\chi|\rangle$
occurs in a problem in which the number of incident particles is one
per unit volume, it will correspond to a scattering coefficient of
amount

$$\frac{4\pi^2W^0W'P'}{hc^4P^0}\,|f(P'\omega\chi)|^2. \tag{37}$$

It is only the value of the function $f(P\omega\chi)$ for the point $P = P'$ that
is of importance.

If we now apply this general theory to our equations (21) and
(22), we have

$$f(P\omega\chi) = h^{3/2}\langle P\omega\chi\alpha'|V|P^0\omega^0\chi^0\alpha^0\rangle.$$

Hence from (37) the scattering coefficient is

$$\frac{4\pi^2h^2W^0W'P'}{c^4P^0}\,|\langle P'\omega\chi\alpha'|V|P^0\omega^0\chi^0\alpha^0\rangle|^2. \tag{38}$$

If one neglects relativity and puts $W^0W'/c^4 = m^2$, this result reduces
to the result (15) obtained in the preceding section by means of
Green’s theorem.
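
The reduction of (38) to (15) can be checked numerically. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the matrix element is passed in as a plain number, and the units are arbitrary but consistent.

```python
import math

def relativistic_energy(p, m, c):
    """(16): W = c * sqrt(m^2 c^2 + P^2)."""
    return c * math.sqrt(m ** 2 * c ** 2 + p ** 2)

def coefficient_38(p0, p_final, m, c, h, matrix_element):
    """(38): 4 pi^2 h^2 W0 W' P' / (c^4 P0) * |matrix element|^2."""
    w0 = relativistic_energy(p0, m, c)
    w_final = relativistic_energy(p_final, m, c)
    return (4.0 * math.pi ** 2 * h ** 2 * w0 * w_final * p_final
            / (c ** 4 * p0) * abs(matrix_element) ** 2)

def coefficient_15(p0, p_final, m, h, matrix_element):
    """(15): 4 pi^2 m^2 h^2 P'/P0 * |matrix element|^2."""
    return (4.0 * math.pi ** 2 * m ** 2 * h ** 2 * p_final / p0
            * abs(matrix_element) ** 2)
```

When the momenta are small compared with mc the two functions agree closely; when they are comparable with mc the relativistic value is the larger, since both $W^0$ and $W'$ then exceed $mc^2$.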


51. Dispersive scattering 

We shall now determine the scattering when the incident particle 
is capable of being absorbed, that is, when our unperturbed system 
of scatterer plus particle has closed stationary states with the particle 
absorbed. The existence of these closed states for the unperturbed 
system will be found to have a considerable effect on the scattering 
for the perturbed system, and indeed an effect that depends very 
much on the energy of the incident particle, giving rise to the pheno- 
menon of dispersion in optics when the incident particle is taken to 
be a photon. 

We use a representation for which the basic kets correspond to 
the stationary states of the unperturbed system, as was the case with 
the p-representation of the preceding section. We take these station-
ary states to be the states $(\mathbf{p}'\alpha')$ for which the particle has a definite
momentum $\mathbf{p}'$ and the scatterer is in a definite state $\alpha'$, together with
the closed states, $k$ say, which form a separate discrete set, and
assume that these states are all independent and orthogonal. This
assumption is not accurate when the particle is an electron or atomic
nucleus, since in this case for an absorbed state $k$ the particle will
still certainly be somewhere, so that one would expect to be able to
expand $|k\rangle$ in terms of the eigenkets $|\mathbf{x}'\alpha'\rangle$ of x, y, z, and the $\alpha$’s,
and hence also in terms of the $|\mathbf{p}'\alpha'\rangle$’s. On the other hand, when the
particle is a photon it will no longer exist for the absorbed states,
which are then certainly independent of and orthogonal to the states
$(\mathbf{p}'\alpha')$ for which the particle does exist. Thus the assumption is valid
in this case, which is an important practical one.

Since we are concerned with scattering, we must still deal with 
stationary states of the whole system. We shall now, however, have 
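to work to the second order of accuracy, so that we cannot use merely
the first-order equation (3), but must use also (4). Equation (3)
becomes, when written in terms of representatives in our present
representation,

$$(W' - W)\langle \mathbf{p}\alpha'|1\rangle = \langle \mathbf{p}\alpha'|V|0\rangle,$$
$$(E' - E_k)\langle k|1\rangle = \langle k|V|0\rangle, \tag{39}$$

where $W'$ is the function of $E'$ and the $\alpha'$’s given by (18) and $E_k$ is the
energy of the stationary state $k$ of the unperturbed system. Similarly,
equation (4) becomes

$$(W' - W)\langle \mathbf{p}\alpha'|2\rangle = \langle \mathbf{p}\alpha'|V|1\rangle,$$
$$(E' - E_k)\langle k|2\rangle = \langle k|V|1\rangle. \tag{40}$$

Expanding the right-hand sides by matrix multiplication, we get

$$(W' - W)\langle \mathbf{p}\alpha'|2\rangle = \sum_{\alpha''}\int \langle \mathbf{p}\alpha'|V|\mathbf{p}''\alpha''\rangle\,d^3p''\,\langle \mathbf{p}''\alpha''|1\rangle + \sum_{k''}\langle \mathbf{p}\alpha'|V|k''\rangle\langle k''|1\rangle,$$
$$(E' - E_k)\langle k|2\rangle = \sum_{\alpha''}\int \langle k|V|\mathbf{p}''\alpha''\rangle\,d^3p''\,\langle \mathbf{p}''\alpha''|1\rangle + \sum_{k''}\langle k|V|k''\rangle\langle k''|1\rangle. \tag{41}$$

The ket $|0\rangle$ is still given by (19), so (39) may be written

$$(W' - W)\langle \mathbf{p}\alpha'|1\rangle = h^{3/2}\langle \mathbf{p}\alpha'|V|\mathbf{p}^0\alpha^0\rangle, \tag{42}$$
$$(E' - E_k)\langle k|1\rangle = h^{3/2}\langle k|V|\mathbf{p}^0\alpha^0\rangle. \tag{43}$$
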

We may assume that the matrix elements $\langle k'|V|k''\rangle$ of V vanish,
since these matrix elements are not essential to the phenomena under
investigation, and if they did not vanish it would mean simply that
the absorbed states $k$ had not been suitably chosen. We shall further
assume that the matrix elements $\langle \mathbf{p}'\alpha'|V|\mathbf{p}''\alpha''\rangle$ are of the second order
of smallness when the matrix elements $\langle k'|V|\mathbf{p}''\alpha''\rangle$, $\langle \mathbf{p}'\alpha'|V|k''\rangle$ are
taken to be of the first order of smallness. This assumption will be
justified for the case of photons in § 64. We now have from (43) and
(42) that $\langle k|1\rangle$ is of the first order of smallness, provided $E'$ does not
lie near one of the discrete set of energy-levels $E_k$, and $\langle \mathbf{p}\alpha'|1\rangle$ is of
the second order. The value of $\langle \mathbf{p}\alpha'|2\rangle$ to the second order will thus
be given, from the first of equations (41), by

$$(W' - W)\langle \mathbf{p}\alpha'|2\rangle = h^{3/2}\sum_{k''}\langle \mathbf{p}\alpha'|V|k''\rangle\langle k''|V|\mathbf{p}^0\alpha^0\rangle/(E' - E_{k''}).$$

The total correction in the wave function to the second order, namely
$\langle \mathbf{p}\alpha'|1\rangle$ plus $\langle \mathbf{p}\alpha'|2\rangle$, therefore satisfies

$$(W' - W)\{\langle \mathbf{p}\alpha'|1\rangle + \langle \mathbf{p}\alpha'|2\rangle\} = h^{3/2}\Big\{\langle \mathbf{p}\alpha'|V|\mathbf{p}^0\alpha^0\rangle + \sum_k \langle \mathbf{p}\alpha'|V|k\rangle\langle k|V|\mathbf{p}^0\alpha^0\rangle/(E' - E_k)\Big\}.$$

This equation is of the type (23), provided $\alpha'$ is such that $W' > mc^2$,
which means that $\alpha'$ as a final state for the scatterer is not incon-
sistent with the law of conservation of energy. We can therefore infer
from the general result (37) that the scattering coefficient is

$$\frac{4\pi^2h^2W^0W'P'}{c^4P^0}\,\Big|\langle \mathbf{p}'\alpha'|V|\mathbf{p}^0\alpha^0\rangle + \sum_k \frac{\langle \mathbf{p}'\alpha'|V|k\rangle\langle k|V|\mathbf{p}^0\alpha^0\rangle}{E' - E_k}\Big|^2. \tag{44}$$

The scattering may now be considered as composed of two parts,
a part that arises from the matrix element $\langle \mathbf{p}'\alpha'|V|\mathbf{p}^0\alpha^0\rangle$ of the per-
turbing energy and a part that arises from the matrix elements
$\langle \mathbf{p}'\alpha'|V|k\rangle$ and $\langle k|V|\mathbf{p}^0\alpha^0\rangle$. The first part, which is the same as our
previously obtained result (38), may be called the direct scattering.
The second part may be considered as arising from an absorption of 
the incident particle into some state k, followed immediately by a 
re-emission in a different direction, and is like the transitions through 
an intermediate state considered in § 44. The fact that we have to 
add the two terms before taking the square of the modulus denotes 
interference between the two kinds of scattering. There is no experi- 
mental way of separating the two kinds, the distinction between 
them being only mathematical. 
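
The interference between the two kinds of scattering can be seen in a small numerical sketch of the quantity inside the modulus signs of (44). The function names and the numbers used are purely illustrative assumptions; the matrix elements are passed in as plain (possibly complex) numbers.

```python
def scattering_amplitude(direct, resonant_terms, e_total):
    """Quantity inside the modulus signs of (44).

    direct stands for <p'a'|V|p0 a0>; resonant_terms is a list of
    (emission_element, absorption_element, e_k) triples standing for
    <p'a'|V|k>, <k|V|p0 a0> and E_k. e_total is E'.
    """
    amplitude = complex(direct)
    for emission_element, absorption_element, e_k in resonant_terms:
        amplitude += emission_element * absorption_element / (e_total - e_k)
    return amplitude

def intensity(amplitude):
    """Square of the modulus, to which the coefficient (44) is proportional."""
    return abs(amplitude) ** 2
```

With a direct term of 1 and a single level at E_k = 2 with unit matrix elements, the two contributions cancel completely at E' = 1 and reinforce at E' = 3, giving an intensity of 4 rather than the value 2 that would result from adding the squares separately.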


52. Resonance scattering 

Suppose the energy of the incident particle to be varied con-
tinuously while the initial state $\alpha^0$ of the scatterer is kept fixed, so
that the total energy $E'$ or $H'$ varies continuously. The formula (44)
now shows that as $E'$ approaches one of the discrete set of energy-
levels $E_k$, the scattering becomes very large. In fact, according to
formula (44) the scattering should be infinite when $E'$ is exactly equal
to an $E_k$. An infinite scattering coefficient is, of course, physically
impossible, so that we can infer that the approximations used in
deriving (44) are no longer legitimate when $E'$ is close to an $E_k$. To
investigate the scattering in this case we must therefore go back to
the exact equation

$$(E' - E)|H'\rangle = V|H'\rangle,$$

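equation (2) of § 43 with $E'$ written for $H'$, and use a different method
of approximating to its solution. This exact equation, written in
terms of representatives like (41), becomes

$$(W' - W)\langle \mathbf{p}\alpha'|H'\rangle = \sum_{\alpha''}\int \langle \mathbf{p}\alpha'|V|\mathbf{p}''\alpha''\rangle\,d^3p''\,\langle \mathbf{p}''\alpha''|H'\rangle + \sum_{k''}\langle \mathbf{p}\alpha'|V|k''\rangle\langle k''|H'\rangle,$$
$$(E' - E_k)\langle k|H'\rangle = \sum_{\alpha''}\int \langle k|V|\mathbf{p}''\alpha''\rangle\,d^3p''\,\langle \mathbf{p}''\alpha''|H'\rangle + \sum_{k''}\langle k|V|k''\rangle\langle k''|H'\rangle. \tag{45}$$

Let us take one particular $E_k$ and consider the case when $E'$ is close
to it. The large term in the scattering coefficient (44) now arises from
those elements of the matrix representing V that lie in row $k$ or in
column $k$, i.e. those of the type $\langle k|V|\mathbf{p}\alpha'\rangle$ or $\langle \mathbf{p}\alpha'|V|k\rangle$. The scatter-
ing arising from the other matrix elements of V is of a smaller order
of magnitude. This suggests that in our exact equations (45) we should
make the approximation of neglecting all the matrix elements of V
except the important ones, which are those of the type $\langle \mathbf{p}\alpha'|V|k\rangle$ or
$\langle k|V|\mathbf{p}\alpha'\rangle$, where $\alpha'$ is a state of the scatterer that has not too much
energy to be disallowed as a final state by the law of conservation of
energy. These equations then reduce to

$$(W' - W)\langle \mathbf{p}\alpha'|H'\rangle = \langle \mathbf{p}\alpha'|V|k\rangle\langle k|H'\rangle,$$
$$(E' - E_k)\langle k|H'\rangle = \sum_{\alpha'}\int \langle k|V|\mathbf{p}\alpha'\rangle\,d^3p\,\langle \mathbf{p}\alpha'|H'\rangle, \tag{46}$$

the $\alpha'$ summation being over those values of $\alpha'$ for which $W'$ given
by (18) is $> mc^2$. These equations are now sufficiently simple for us
to be able to solve exactly without further approximation.

From the first of equations (46) we obtain by division

$$\langle \mathbf{p}\alpha'|H'\rangle = \langle \mathbf{p}\alpha'|V|k\rangle\langle k|H'\rangle/(W' - W) + \lambda\,\delta(W' - W). \tag{47}$$

We must choose $\lambda$, which may be any function of the momentum
$\mathbf{p}$ and $\alpha'$, such that (47) represents the incident particles corresponding
to $|0\rangle$ or $h^{3/2}|\mathbf{p}^0\alpha^0\rangle$ together with only outward moving particles. [The
representative of $h^{3/2}|\mathbf{p}^0\alpha^0\rangle$ is actually of the form $\lambda\,\delta(W' - W)$, since
the conditions $\alpha' = \alpha^0$ and $\mathbf{p} = \mathbf{p}^0$ for it not to vanish lead to
$W' = E' - H_s(\alpha') = E' - H_s(\alpha^0) = W^0 = W$.] Thus (47) must be

$$\langle \mathbf{p}\alpha'|H'\rangle = h^{3/2}\langle \mathbf{p}\alpha'|\mathbf{p}^0\alpha^0\rangle + \langle \mathbf{p}\alpha'|V|k\rangle\langle k|H'\rangle\{1/(W' - W) - i\pi\,\delta(W' - W)\}, \tag{48}$$

and from the general formula (37) the scattering coefficient will be

$$\frac{4\pi^2W^0W'P'}{hc^4P^0}\,|\langle \mathbf{p}'\alpha'|V|k\rangle|^2\,|\langle k|H'\rangle|^2. \tag{49}$$

It remains for us to determine the value of $\langle k|H'\rangle$. We can do this
by substituting for $\langle \mathbf{p}\alpha'|H'\rangle$ in the second of equations (46) its value
given by (48). This gives

$$(E' - E_k)\langle k|H'\rangle = h^{3/2}\langle k|V|\mathbf{p}^0\alpha^0\rangle + \langle k|H'\rangle\sum_{\alpha'}\int |\langle k|V|\mathbf{p}\alpha'\rangle|^2\{1/(W' - W) - i\pi\,\delta(W' - W)\}\,d^3p$$
$$= h^{3/2}\langle k|V|\mathbf{p}^0\alpha^0\rangle + \langle k|H'\rangle(a - ib),$$

where

$$a = \sum_{\alpha'}\int |\langle k|V|\mathbf{p}\alpha'\rangle|^2\,d^3p/(W' - W) \tag{50}$$

and

$$b = \pi\sum_{\alpha'}\int |\langle k|V|\mathbf{p}\alpha'\rangle|^2\,\delta(W' - W)\,d^3p$$
$$= \pi\sum_{\alpha'}\iiint |\langle k|V|P\omega\chi\alpha'\rangle|^2\,\delta(W' - W)\,P^2\,dP\,\sin\omega\,d\omega\,d\chi$$
$$= \pi\sum_{\alpha'}P'W'c^{-2}\iint |\langle k|V|P'\omega\chi\alpha'\rangle|^2\,\sin\omega\,d\omega\,d\chi. \tag{51}$$

Thus

$$\langle k|H'\rangle = h^{3/2}\langle k|V|\mathbf{p}^0\alpha^0\rangle/(E' - E_k - a + ib). \tag{52}$$

Note that $a$ and $b$ are real and that $b$ is positive.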
This value for $\langle k|H'\rangle$ substituted in (49) gives for the scattering
coefficient

$$\frac{4\pi^2h^2W^0W'P'}{c^4P^0}\,\frac{|\langle \mathbf{p}'\alpha'|V|k\rangle|^2\,|\langle k|V|\mathbf{p}^0\alpha^0\rangle|^2}{(E' - E_k - a)^2 + b^2}. \tag{53}$$

One can obtain the total effective area that the incident particle
must hit in order to be scattered anywhere by integrating (53) over
all directions of scattering, i.e. by integrating over all directions of
the vector $\mathbf{p}'$ with its magnitude kept fixed at $P'$, and then summing
over all $\alpha'$ that are to be taken into consideration, i.e. for which
$W' > mc^2$. This gives, with the help of (51), the result

$$\frac{4\pi h^2W^0}{c^2P^0}\,\frac{b\,|\langle k|V|\mathbf{p}^0\alpha^0\rangle|^2}{(E' - E_k - a)^2 + b^2}. \tag{54}$$

If we suppose $E'$ to vary continuously through the value $E_k$, the
main variation of (53) or (54) will be due to the small denominator
$(E' - E_k - a)^2 + b^2$. If we neglect the dependence of the other factors
in (53) and (54) on $E'$, then the maximum scattering will occur when
$E'$ has the value $E_k + a$ and the scattering will be half its maximum
when $E'$ differs from this value by an amount $b$. The large amount of
scattering that occurs for values of the energy of the incident particle
that make $E'$ nearly equal to $E_k$ gives rise to the phenomenon of an
absorption line. The centre of the line is displaced by an amount
$a$ from the resonance energy of the incident particle, i.e. the energy
which would make the total energy just $E_k$, while the quantity $b$ is
what is sometimes called the half-width of the line.
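
The shape of the absorption line can be sketched by isolating the energy dependence of (53) and (54). The function name and the lumping of all slowly varying factors into a single `prefactor` argument are assumptions made for the illustration, in line with the neglect of their dependence on E' in the text.

```python
def resonance_coefficient(e_total, e_k, a, b, prefactor=1.0):
    """Energy dependence of (53) or (54).

    Returns prefactor / ((E' - E_k - a)^2 + b^2), with the other factors
    of the scattering coefficient treated as constant.
    """
    return prefactor / ((e_total - e_k - a) ** 2 + b ** 2)
```

Evaluating this at E' = E_k + a gives the maximum value prefactor/b^2, and at E' = E_k + a ± b exactly half of that, which is the sense in which b is the half-width.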


53. Emission and absorption 

For studying emission and absorption we must consider non- 
stationary states of the system and must use the perturbation method 
of § 44. To determine the coefficient of spontaneous emission we must 
take an initial state for which the particle is absorbed, corresponding 
to a ket |k>, and determine the probability that at some later time 
the particle shall be on its way to infinity with a definite momentum. 
The method of § 46 can now be applied. From the result (39) of that 
section we see that the probability per unit time per unit range of w 
and x, of the particle being emitted in any direction w’, y’ with the 
scatterer being left in state a’ is 


Qh] <W'ew'x'a’ [Vd |2, (55) 


provided, of course, that $\alpha'$ is such that the energy $W'$, given by (18), 
of the particle is greater than $mc^2$. For values of $\alpha'$ that do not satisfy 
this condition there is no emission possible. The matrix element 
$\langle W'\omega'\chi'\alpha'|V|k\rangle$ here must refer to a representation in which $W$, $\omega$, $\chi$, 
and $\alpha$ are diagonal with the weight function unity. The matrix 
elements of $V$ appearing in the three preceding sections refer to a 
representation in which $p_x$, $p_y$, $p_z$ are diagonal with the weight function 
unity, or $P$, $\omega$, $\chi$ are diagonal with the weight function $P^2\sin\omega$. 
They would thus refer to a representation in which $W$, $\omega$, $\chi$ are 
diagonal with the weight function $dP/dW \cdot P^2\sin\omega = WP/c^2 \cdot \sin\omega$. 
Thus the matrix element $\langle W'\omega'\chi'\alpha'|V|k\rangle$ in (55) is equal to 
$(W'P'/c^2 \cdot \sin\omega')^{1/2}$ times our previous matrix element 
$\langle W'\omega'\chi'\alpha'|V|k\rangle$ or $\langle p'\alpha'|V|k\rangle$, so that (55) is equal to 

$$\frac{2\pi}{\hbar}\,\frac{W'P'}{c^2}\,\sin\omega'\,|\langle p'\alpha'|V|k\rangle|^2.$$

The probability of emission per unit solid angle per unit time, with 
the scatterer simultaneously dropping to state $\alpha'$, is thus 

$$\frac{2\pi}{\hbar}\,\frac{W'P'}{c^2}\,|\langle p'\alpha'|V|k\rangle|^2. \tag{56}$$
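
The change of weight function used above rests on the relation $dP/dW \cdot P^2 = WP/c^2$, which follows from $W^2 = m^2c^4+c^2P^2$. A minimal numerical sketch of this step is given below; the function name `momentum`, the units $m = c = 1$ and the sample energy are assumptions made purely for illustration.

```python
import math

def momentum(W, m, c):
    """Magnitude P of the momentum for relativistic energy W > m c^2."""
    return math.sqrt(W * W - (m * c * c) ** 2) / c

m, c = 1.0, 1.0    # illustrative units
W = 2.5            # any energy above m c^2
step = 1e-6        # step for the central difference

# dP/dW estimated by a central difference.
dP_dW = (momentum(W + step, m, c) - momentum(W - step, m, c)) / (2 * step)
P = momentum(W, m, c)

weight_from_jacobian = dP_dW * P ** 2    # dP/dW . P^2
weight_closed_form = W * P / c ** 2      # W P / c^2

print(weight_from_jacobian, weight_closed_form)
```

The two printed numbers should agree to within the accuracy of the finite difference.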
To obtain the total probability per unit time of the particle being 
emitted in any direction, with any final state for the scatterer, we 
must integrate (56) over all angles $\omega'$, $\chi'$ and sum over all states $\alpha'$ 
whose energy $H_s(\alpha')$ is such that $H_s(\alpha')+mc^2 < E_k$. The result is 
just $2b/\hbar$, where $b$ is defined by (51). There is thus this simple 
relation between the total emission coefficient and the half-width $b$ of the 
absorption line. 

Let us now consider absorption. This requires that we shall take 
an initial state for which the particle is certainly not absorbed but is 
incident with a definite momentum. Thus the ket corresponding to 
the initial state must be of the form (19). We must now determine 
the probability of the particle being absorbed after time $t$. Since our 
final state $k$ is not one of a continuous range, we cannot use directly 
the result (39) of § 46. If, however, we take 

$$|0\rangle = |p^0\alpha^0\rangle, \tag{57}$$

as the ket corresponding to the initial state, the analysis of §§ 44 and 46 
is still applicable as far as equation (36) and shows us that the 
probability of the particle being absorbed into state $k$ after time $t$ is 

$$2\,|\langle k|V|p^0\alpha^0\rangle|^2\,[1-\cos\{(E_k-E')t/\hbar\}]/(E_k-E')^2.$$

This corresponds to a distribution of incident particles of density 
$h^{-3}$, owing to the omission of the factor $h^{3/2}$ from (57), as compared 
with (19). The probability of there being an absorption after time 
$t$ when there is one incident particle crossing unit area per unit time 
is therefore 

$$\frac{2h^3W^0}{c^2P^0}\,|\langle k|V|p^0\alpha^0\rangle|^2\,[1-\cos\{(E_k-E')t/\hbar\}]/(E_k-E')^2. \tag{58}$$

To obtain the absorption coefficient we must consider the incident 
particles not all to have exactly the same energy $W^0 = E'-H_s(\alpha^0)$, 
but to have a distribution of energy values about the correct value 
$E_k-H_s(\alpha^0)$ required for absorption. If we take a beam of incident 
particles consisting of one crossing unit area per unit time per unit 
energy range, the probability of there being an absorption after time 
$t$ will be given by the integral of (58) with respect to $E'$. This integral 
may be evaluated in the same way as (37) of § 46 and is equal to 

$$\frac{4\pi^2h^2W^0t}{c^2P^0}\,|\langle k|V|p^0\alpha^0\rangle|^2.$$

The probability per unit time of an absorption taking place with an 
incident beam of one particle per unit area per unit time per unit 
energy range is therefore 

$$\frac{4\pi^2h^2W^0}{c^2P^0}\,|\langle k|V|p^0\alpha^0\rangle|^2, \tag{59}$$

which is the absorption coefficient. 
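
The step from (58) to (59) uses the integral $\int_{-\infty}^{\infty}(1-\cos u)/u^2\,du = \pi$: with $u = (E_k-E')t/\hbar$ the integral over $E'$ becomes $\pi t/\hbar$, and $2h^3 \cdot \pi t/\hbar = 4\pi^2h^2t$. The sketch below checks the value of the integral numerically. It is only an illustration; the window `L`, the number of intervals `n` and the simple tail estimate are assumptions of this sketch and not part of the method of § 46.

```python
import math

def integrand(u):
    """(1 - cos u) / u^2, written as 2 sin^2(u/2) / u^2 to avoid cancellation."""
    if u == 0.0:
        return 0.5                      # limiting value at u = 0
    s = math.sin(0.5 * u)
    return 2.0 * s * s / (u * u)

L = 2000.0       # half-width of the integration window
n = 400_000      # number of trapezoid intervals on [-L, L]
du = 2.0 * L / n

# Trapezoid rule on [-L, L].
total = 0.5 * (integrand(-L) + integrand(L))
for i in range(1, n):
    total += integrand(-L + i * du)
total *= du

# Tails beyond |u| > L: the cosine term averages out, leaving about 1/L each.
total += 2.0 / L

print(total, math.pi)
```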
The connexion between the absorption and emission coefficients 
(59) and (56) and the resonance scattering coefficients calculated in 
the preceding section should be noted. When the incident beam does 
not consist of particles all with the same energy, but consists of a unit 
distribution of particles per unit energy range crossing unit area per 
unit time, the total number of incident particles with energies near 
an absorption line that get scattered will be given by the integral 
of (54) with respect to $E'$. If one neglects the dependence of the 
numerator of (54) on $E'$, this integral will, since 

$$\int_{-\infty}^{\infty}\frac{b}{(E'-E_k-a)^2+b^2}\,dE' = \pi,$$

have just the value (59). Thus the total number of scattered particles 
in the neighbourhood of an absorption line is equal to the total number 
absorbed. We can therefore regard all these scattered particles as 
absorbed particles that are subsequently re-emitted in a different 
direction. Further, the number of particles in the neighbourhood of 
the absorption line that get scattered per unit solid angle about a 
given direction specified by $p'$ and then belong to scatterers in state 
$\alpha'$ will be given by the integral with respect to $E'$ of (53), which 
integral has in the same way the value 

$$\frac{4\pi^3h^2W^0W'P'}{b\,c^4P^0}\,|\langle p'\alpha'|V|k\rangle|^2\,|\langle k|V|p^0\alpha^0\rangle|^2.$$

This is just equal to the absorption coefficient (59) multiplied by the 
emission coefficient (56) divided by $2b/\hbar$, the total emission coefficient. 
This is in agreement with the point of view of regarding the resonance 
scattered particles as those that are absorbed and then re-emitted, 
with the absorption and emission processes governed independently 
each by its own probability law, since this point of view would 
make the fraction of the total number of absorbed particles that are 
re-emitted in a unit solid angle about a given direction just the 
emission coefficient for this direction divided by the total emission 
coefficient. 
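
The integral of the resonance factor over the line, used above to connect (54) with (59), can be illustrated with its elementary antiderivative. This is a sketch under stated assumptions: the function name `window_integral` and the value of $b$ are illustrative, and the variable $x$ stands for $E'-E_k-a$.

```python
import math

def window_integral(half_window, b):
    """Integral of b/(x^2 + b^2) over [-half_window, half_window]."""
    return 2.0 * math.atan(half_window / b)

b = 0.05   # illustrative half-width, arbitrary energy units

# The value approaches pi as the window grows.
for half_window in (1 * b, 10 * b, 100 * b, 10_000 * b):
    print(half_window / b, window_integral(half_window, b))
```

The window of half-width $b$ contains exactly half of the limiting value $\pi$, so half of the scattering in the neighbourhood of the line comes from energies within the half-width of its centre.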


IX 
SYSTEMS CONTAINING SEVERAL SIMILAR PARTICLES 


54. Symmetrical and antisymmetrical states 

If a system in atomic physics contains a number of particles of the 
same kind, e.g. a number of electrons, the particles are absolutely 
indistinguishable one from another. No observable change is made 
when two of them are interchanged. This circumstance gives rise to 
some curious phenomena in quantum mechanics having no analogue 
in the classical theory, which arise from the fact that in quantum 
mechanics a transition may occur resulting in merely the interchange 
of two similar particles, which transition then could not be detected 
by any observational means. A satisfactory theory ought, of course, 
to count two observationally indistinguishable states as the same 
state and to deny that any transition does occur when two similar 
particles exchange places. We shall find that it is possible to reformu- 
late the theory so that this is so. 

Suppose we have a system containing $n$ similar particles. We may 
take as our dynamical variables a set of variables $\xi_1$ describing the 
first particle, the corresponding set $\xi_2$ describing the second particle, 
and so on up to the set $\xi_n$ describing the $n$th particle. We shall then 
have the $\xi_r$'s commuting with the $\xi_s$'s for $r \neq s$. (We may require 
certain extra variables, describing what the system consists of in 
addition to the $n$ similar particles, but it is not necessary to mention 
these explicitly in the present chapter.) The Hamiltonian describing 
the motion of the system will now be expressible as a function of the 
$\xi_1, \xi_2, \ldots, \xi_n$. The fact that the particles are similar requires that the 
Hamiltonian shall be a symmetrical function of the $\xi_1, \xi_2, \ldots, \xi_n$, i.e. it 
shall remain unchanged when the sets of variables $\xi_r$ are interchanged 
or permuted in any way. This condition must hold, no matter what 
perturbations are applied to the system. In fact, any quantity of 
physical significance must be a symmetrical function of the $\xi$'s. 

Let $|a_1\rangle$, $|b_1\rangle$,... be kets for the first particle considered as a 
dynamical system by itself. There will be corresponding kets $|a_2\rangle$, $|b_2\rangle$,... for 
the second particle by itself, and so on. We can get a ket for the 
assembly by taking the product of kets for each particle by itself, 
for example 

$$|a_1\rangle|b_2\rangle|c_3\rangle\ldots|g_n\rangle = |a_1 b_2 c_3 \ldots g_n\rangle \tag{1}$$

say, according to the notation of (65) of § 20. The ket (1) corresponds 
to a special kind of state for the assembly, which may be described 
by saying that each particle is in its own state, corresponding to its 
own factor on the left-hand side of (1). The general ket for the 
assembly is of the form of a sum or integral of kets like (1), and 
corresponds to a state for the assembly for which one cannot say that 
each particle is in its own state, but only that each particle is partly 
in several states, in a way which is correlated with the other particles 
being partly in several states. If the kets $|a_1\rangle$, $|b_1\rangle$,... are a set of 
basic kets for the first particle by itself, the kets $|a_2\rangle$, $|b_2\rangle$,... will be 
a set of basic kets for the second particle by itself, and so on, and the 
kets (1) will be a set of basic kets for the assembly. We call the repre- 
sentation provided by such basic kets for the assembly a symmetrical 
representation, as it treats all the particles on the same footing. 

In (1) we may interchange the kets for the first two particles and 
get another ket for the assembly, namely 

$$|b_1\rangle|a_2\rangle|c_3\rangle\ldots|g_n\rangle = |b_1 a_2 c_3 \ldots g_n\rangle.$$

More generally, we may interchange the role of the first two particles 
in any ket for the assembly and get another ket for the assembly. 
The process of interchanging the first two particles is an operator 
which can be applied to kets for the assembly, and is evidently a 
linear operator, of the type dealt with in § 7. Similarly, the process 
of interchanging any pair of particles is a linear operator, and by 
repeated applications of such interchanges we get any permutation 
of the particles appearing as a linear operator which can be applied 
to kets for the assembly. A permutation is called an even permutation 
or an odd permutation according to whether it can be built up from 
an even or an odd number of interchanges. 
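
The division into even and odd permutations can be made concrete by counting interchanges. The following is a minimal sketch; the function name `parity` and the representation of a permutation as a tuple of images of 0, 1, ..., n-1 are assumptions made for illustration.

```python
from itertools import permutations

def parity(perm):
    """Return +1 for an even permutation and -1 for an odd one.

    `perm` is a tuple giving the images of 0, 1, ..., n-1.  The permutation
    is undone by explicit interchanges, and the interchanges are counted.
    """
    p = list(perm)
    swaps = 0
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]   # one interchange puts j in its own place
            swaps += 1
    return 1 if swaps % 2 == 0 else -1

signs = [parity(p) for p in permutations(range(3))]
print(signs, sum(signs))
```

For three objects there are three even and three odd permutations, so the signs sum to zero.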

A ket for the assembly $|X\rangle$ is called symmetrical if it is unchanged 
by any permutation, i.e. if 

$$P|X\rangle = |X\rangle \tag{2}$$

for any permutation $P$. It is called antisymmetrical if it is unchanged 
by any even permutation and has its sign changed by any odd 
permutation, i.e. if 

$$P|X\rangle = \pm|X\rangle, \tag{3}$$


the + or — sign being taken according to whether P is even or odd. 
The state corresponding to a symmetrical ket is called a symmetrical 
state, and the state corresponding to an antisymmetrical ket is called 
an antisymmetrical state. In a symmetrical representation, the 
representative of a symmetrical ket is a symmetrical function of the 
variables referring to the various particles and the representative of 
an antisymmetrical ket is an antisymmetrical function. 

In the Schrödinger picture, the ket corresponding to a state of the 
assembly will vary with time according to Schrödinger’s equation of 
motion. If it is initially symmetrical it must always remain sym- 
metrical, since, owing to the Hamiltonian being symmetrical, there 
is nothing to disturb the symmetry. Similarly if the ket is initially 
antisymmetrical it must always remain antisymmetrical. Thus a 
state which is initially symmetrical always remains symmetrical and 
a state which is initially antisymmetrical always remains antisym- 
metrical. In consequence, it may be that for a particular kind of 
particle only symmetrical states occur in nature, or only anti- 
symmetrical states occur in nature. If either of these possibilities 
held, it would lead to certain special phenomena for the particles in 
question. 

Let us suppose first that only antisymmetrical states occur in 
nature. The ket (1) is not antisymmetrical and so does not corre- 
spond to a state occurring in nature. From (1) we can in general form 
an antisymmetrical ket by applying all possible permutations to it 
and adding the results, with the coefficient —1 inserted before those 
terms arising from an odd permutation, so as to get 


$$\sum_P \pm P|a_1 b_2 c_3 \ldots g_n\rangle, \tag{4}$$


the + or — sign being taken according to whether P is even or odd. 
The ket (4) may be written as a determinant 


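$$\begin{vmatrix}
|a_1\rangle & |a_2\rangle & |a_3\rangle & \cdots & |a_n\rangle \\
|b_1\rangle & |b_2\rangle & |b_3\rangle & \cdots & |b_n\rangle \\
|c_1\rangle & |c_2\rangle & |c_3\rangle & \cdots & |c_n\rangle \\
\vdots & \vdots & \vdots & & \vdots \\
|g_1\rangle & |g_2\rangle & |g_3\rangle & \cdots & |g_n\rangle
\end{vmatrix} \tag{5}$$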


and its representative in a symmetrical representation is a determi- 
nant. The ket (4) or (5) is not the general antisymmetrical ket, but 
is a specially simple one. It corresponds to a state for the assembly 
for which one can say that certain particle-states, namely the states 
a,b,c,...,g, are occupied, but one cannot say which particle is in 
which state, each particle being equally likely to be in any state. If 
two of the particle-states a,b,c,...,g are the same, the ket (4) or (5) 
vanishes and does not correspond to any state for the assembly. 
Thus two particles cannot occupy the same state. More generally, the 
occupied states must be all independent, otherwise (4) or (5) vanishes. 
This is an important characteristic of particles for which only anti- 
symmetrical states occur in nature. It leads to a special statistics, 
which was first studied by Fermi, so we shall call particles for which 
only antisymmetrical states occur in nature fermions. 
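
The vanishing of (4) when two particle-states coincide can be seen in a small computation. The sketch below is an illustration only: a product ket is represented, by assumption, as a tuple whose $r$th entry is the state of particle $r$, and the names `sign` and `antisymmetrize` are hypothetical helpers.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation of 0..n-1, from its number of inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def antisymmetrize(states):
    """Sum over P of (+/-) P applied to the product ket built from `states`.

    The result maps each product ket (a tuple whose r-th entry is the state
    of particle r) to its coefficient; zero coefficients are dropped.
    """
    ket = {}
    for perm in permutations(range(len(states))):
        term = tuple(states[i] for i in perm)
        ket[term] = ket.get(term, 0) + sign(perm)
    return {term: coeff for term, coeff in ket.items() if coeff != 0}

print(antisymmetrize(("a", "b", "c")))   # six terms with coefficients +1 and -1
print(antisymmetrize(("a", "a", "c")))   # empty: two particles in the same state
```

With a repeated state every term is cancelled by the term differing from it by one interchange, which is the exclusion property described above.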

Let us suppose now that only symmetrical states occur in nature. 
The ket (1) is not symmetrical, except in the special case when all the 
particle-states a,b,c,...,g are the same, but we can always obtain a 
symmetrical ket from it by applying all possible permutations to it 
and adding the results, so as to get 

$$\sum_P P|a_1 b_2 c_3 \ldots g_n\rangle. \tag{6}$$

The ket (6) is not the general symmetrical ket, but is a specially 
simple one. It corresponds to a state for the assembly for which one 
can say that certain particle-states are occupied, namely the states 
a,b,c,...,g, without being able to say which particle is in which state. 
It is now possible for two or more of the states a,b,c,...,g to be the 
same, so that two or more particles can be in the same state. In spite 
of this, the statistics of the particles is not the same as the usual 
statistics of the classical theory. The new statistics was first studied 
by Bose, so we shall call particles for which only symmetrical states 
occur in nature bosons. 

We can see the difference of Bose statistics from the usual statistics 
by considering a special case, that of only two particles and only two 
independent states $a$ and $b$ for a particle. According to classical 
mechanics, if the assembly of two particles is in thermodynamic 
equilibrium at a high temperature, each particle will be equally likely 
to be in either state. There is thus a probability $\frac{1}{4}$ of both particles 
being in state $a$, a probability $\frac{1}{4}$ of both particles being in state $b$, 
and a probability $\frac{1}{2}$ of one particle being in each state. In the 
quantum theory there are three independent symmetrical states for the 
pair of particles, corresponding to the symmetrical kets $|a_1\rangle|a_2\rangle$, 
$|b_1\rangle|b_2\rangle$, and $|a_1\rangle|b_2\rangle+|a_2\rangle|b_1\rangle$, and describable as both particles in 
state $a$, both particles in state $b$, and one particle in each state 
respectively. For thermodynamic equilibrium at a high temperature 
these three states are equally probable, as was shown in § 33, so that 
there is a probability $\frac{1}{3}$ of both particles being in state $a$, a probability 
$\frac{1}{3}$ of both particles being in state $b$, and a probability $\frac{1}{3}$ of one particle 
being in each state. Thus with Bose statistics the probability of two 
particles being in the same state is greater than with classical statistics. 
Bose statistics differ from classical statistics in the opposite direction 
to Fermi statistics, for which the probability of two particles being 
in the same state is zero. 
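
The three ways of counting for two particles and two states can be enumerated directly. This is a minimal sketch under the high-temperature assumption made in the text, namely that all admissible states are equally probable; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product, combinations, combinations_with_replacement

states = ("a", "b")

# Classical statistics: the particles are labelled, every assignment equally likely.
classical = list(product(states, repeat=2))
p_same_classical = Fraction(sum(s[0] == s[1] for s in classical), len(classical))

# Bose statistics: only the occupation of the states matters (symmetrical kets).
bose = list(combinations_with_replacement(states, 2))
p_same_bose = Fraction(sum(s[0] == s[1] for s in bose), len(bose))

# Fermi statistics: no two particles in the same state (antisymmetrical kets).
fermi = list(combinations(states, 2))
p_same_fermi = Fraction(sum(s[0] == s[1] for s in fermi), len(fermi))

print(p_same_classical, p_same_bose, p_same_fermi)   # 1/2 2/3 0
```

The probability of finding both particles in the same state is thus 1/2 classically, 2/3 for bosons and zero for fermions.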

In building up a theory of atoms on the lines mentioned at the 
beginning of § 38, to get agreement with experiment one must assume 
that two electrons are never in the same state. This rule is known as 
Pauli’s exclusion principle. It shows us that electrons are fermions. 
Planck’s law of radiation shows us that photons are bosons, as only the 
Bose statistics for photons will lead to Planck’s law. Similarly, for 
each of the other kinds of particle known in physics, there is experi- 
mental evidence to show either that they are fermions, or that they 
are bosons. Protons, neutrons, positrons are fermions, $\alpha$-particles are 
bosons. It appears that all particles occurring in nature are either 
fermions or bosons, and thus only antisymmetrical or symmetrical 
states for an assembly of similar particles are met with in practice. 
Other more complicated kinds of symmetry are possible mathemati- 
cally, but do not apply to any known particles. With a theory which 
allows only antisymmetrical or only symmetrical states for a particu- 
lar kind of particle, one cannot make a distinction between two states 
which differ only through a permutation of the particles, so that the 
transitions mentioned at the beginning of this section disappear. 


55. Permutations as dynamical variables 

We shall now build up a general theory for a system containing n 
similar particles when states with any kind of symmetry properties 
are allowed, i.e. when there is no restriction to only symmetrical or 
only antisymmetrical states. The general state now will not be 
symmetrical or antisymmetrical, nor will it be expressible linearly in 
terms of symmetrical and antisymmetrical states when n > 2. This 
theory will not apply directly to any particles occurring in nature, 
but all the same it is useful for setting up an approximate treatment 
for an assembly of electrons, as will be shown in § 58. 

We have seen that each permutation P of the n particles is a linear 
operator which can be applied to any ket for the assembly. Hence 
we can regard $P$ as a dynamical variable in our system of $n$ particles. 
There are $n!$ permutations, each of which can be regarded as a 
dynamical variable. One of them, $P_1$ say, is the identical permutation, 
which is equal to unity. The product of any two permutations is a 
third permutation and hence any function of the permutations is 
reducible to a linear function of them. Any permutation $P$ has a 
reciprocal $P^{-1}$ satisfying 

$$PP^{-1} = P^{-1}P = P_1 = 1.$$


A permutation $P$ can be applied to a bra $\langle X|$ for the assembly, 
to give another bra, which we shall denote for the present by $P\langle X|$. 
If $P$ is applied to both factors of the product $\langle X|Y\rangle$, the product 
must be unchanged, since it is just a number, independent of any 
order of the particles. Thus 

$$(P\langle X|)\,P|Y\rangle = \langle X|Y\rangle,$$

showing that 

$$P\langle X| = \langle X|P^{-1}. \tag{7}$$

Now $P\langle X|$ is the conjugate imaginary of $P|X\rangle$ and is thus equal to 
$\langle X|\bar{P}$, and hence from (7) 

$$\bar{P} = P^{-1}. \tag{8}$$

Thus a permutation is not in general a real dynamical variable, its 
conjugate complex being equal to its reciprocal. 


Any permutation of the numbers 1, 2, 3,..., $n$ may be expressed in 
the cyclic notation, e.g. with $n = 8$ 

$$P_a = (143)(27)(58)(6), \tag{9}$$

in which each number is to be replaced by the succeeding number in 
a bracket, unless it is the last in a bracket, when it is to be replaced 
by the first in that bracket. Thus $P_a$ changes the numbers 12345678 
into 47138625. The type of any permutation is specified by the 
partition of the number $n$ which is provided by the number of 
numbers in each of the brackets. Thus the type of $P_a$ is specified by the 
partition $8 = 3+2+2+1$. Permutations of the same type, i.e. 
corresponding to the same partition, we shall call similar. Thus, for 
example, $P_a$ in (9) is similar to 

$$P_b = (871)(35)(46)(2). \tag{10}$$

The whole of the $n!$ possible permutations may be divided into sets 
of similar permutations, each such set being called a class. The 
permutation $P_1 = 1$ forms a class by itself. Any permutation is similar 
to its reciprocal. 
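
The cyclic notation and the notion of type lend themselves to a short computation. The sketch below takes $P_a$ to be the permutation that changes 12345678 into 47138625, i.e. $(143)(27)(58)(6)$, and $P_b$ as in (10); the helper names `from_cycles` and `cycle_type` and the dictionary representation are assumptions made for illustration.

```python
def from_cycles(cycles, n):
    """Build the map i -> image of i (numbers 1..n) from a list of cycles."""
    image = {i: i for i in range(1, n + 1)}
    for cycle in cycles:
        for k, i in enumerate(cycle):
            image[i] = cycle[(k + 1) % len(cycle)]   # last wraps round to first
    return image

def cycle_type(image):
    """Partition of n given by the cycle lengths, largest first."""
    seen, lengths = set(), []
    for start in image:
        if start in seen:
            continue
        length, i = 0, start
        while i not in seen:
            seen.add(i)
            i = image[i]
            length += 1
        lengths.append(length)
    return tuple(sorted(lengths, reverse=True))

P_a = from_cycles([(1, 4, 3), (2, 7), (5, 8), (6,)], 8)
P_b = from_cycles([(8, 7, 1), (3, 5), (4, 6), (2,)], 8)
P_a_inverse = {j: i for i, j in P_a.items()}

digits = "".join(str(P_a[i]) for i in range(1, 9))
print(digits)                                            # 47138625
print(cycle_type(P_a), cycle_type(P_b), cycle_type(P_a_inverse))
```

All three permutations have the type 3+2+2+1, which illustrates both the similarity of $P_a$ and $P_b$ and the similarity of a permutation to its reciprocal.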
When two permutations $P_a$ and $P_b$ are similar, either of them $P_b$ 
may be obtained by making a certain permutation $P_x$ in the other 
$P_a$. Thus, in our example (9), (10) we can take $P_x$ to be the 
permutation that changes 14327586 into 87135462, i.e. the permutation 

$$P_x = (18623)(475).$$

Different ways of writing $P_a$ and $P_b$ in the cyclic notation would lead 
to different $P_x$'s. Any of these $P_x$'s applied to the product $P_a|X\rangle$ 
would change it into $P_b \cdot P_x|X\rangle$, i.e. 

$$P_xP_a|X\rangle = P_bP_x|X\rangle.$$

Hence 

$$P_b = P_xP_aP_x^{-1}, \tag{11}$$

which expresses the condition for $P_a$ and $P_b$ to be similar as an 
algebraic equation. The existence of any $P_x$ satisfying (11) is 
sufficient to show that $P_a$ and $P_b$ are similar. 
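
The relation $P_b = P_xP_aP_x^{-1}$ can be verified for the example, taking $P_a = (143)(27)(58)(6)$, $P_b = (871)(35)(46)(2)$ and $P_x = (18623)(475)$. The sketch below is illustrative only: it treats each permutation as a replacement of numbers and assumes that in a product the right-hand factor acts first; the helper names are hypothetical.

```python
def from_cycles(cycles, n):
    """Build the map i -> image of i (numbers 1..n) from a list of cycles."""
    image = {i: i for i in range(1, n + 1)}
    for cycle in cycles:
        for k, i in enumerate(cycle):
            image[i] = cycle[(k + 1) % len(cycle)]
    return image

def compose(p, q):
    """The product p q: apply q first, then p."""
    return {i: p[q[i]] for i in q}

def inverse(p):
    """The reciprocal permutation."""
    return {j: i for i, j in p.items()}

P_a = from_cycles([(1, 4, 3), (2, 7), (5, 8), (6,)], 8)
P_b = from_cycles([(8, 7, 1), (3, 5), (4, 6), (2,)], 8)
P_x = from_cycles([(1, 8, 6, 2, 3), (4, 7, 5)], 8)

similar = compose(compose(P_x, P_a), inverse(P_x)) == P_b
print(similar)   # True
```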


56. Permutations as constants of the motion 

Any symmetrical function V of the dynamical variables of all the 
particles is unchanged by the application of any permutation P, so 
$P$ applied to the product $V|X\rangle$ affects only the factor $|X\rangle$, thus 

$$PV|X\rangle = VP|X\rangle.$$

Hence 

$$PV = VP, \tag{12}$$

showing that a symmetrical function of the dynamical variables 
commutes with every permutation. The Hamiltonian is a symmetrical 
function of the dynamical variables and thus commutes with every 
permutation. It follows that each permutation is a constant of the 
motion. This holds even if the Hamiltonian is not constant. If $|Xt\rangle$ 
is any solution of Schrödinger's equation of motion, $P|Xt\rangle$ is another. 

In dealing with any system in quantum mechanics, when we have 
found a constant of the motion $\alpha$, we know that if for any state of 
motion, $\alpha$ initially has the numerical value $\alpha'$, then it always has this 
value, so that we can assign different numbers $\alpha'$ to the different 
states and so obtain a classification of the states. The procedure is 
not so straightforward, however, when we have several constants of 
the motion $\alpha$ which do not commute (as is the case with our 
permutations $P$), since we cannot in general assign numerical values for all 
the $\alpha$'s simultaneously to any state. Let us first take the case of a 
system whose Hamiltonian does not involve the time explicitly. The 
existence of constants of the motion $\alpha$ which do not commute is 
then a sign that the system is degenerate. This is because, for a 
non-degenerate system, the Hamiltonian $H$ by itself forms a complete 
set of commuting observables and hence, from Theorem 2 of § 19, each 
of the $\alpha$'s is a function of $H$ and therefore commutes with any other $\alpha$. 

We must now look for a function $\beta$ of the $\alpha$'s which has one and 
the same numerical value $\beta'$ for all those states belonging to one 
energy-level $H'$, so that we can use $\beta$ for classifying the energy-levels 
of the system. We can express the condition for $\beta$ by saying that it 
must be a function of $H$ and must therefore commute with every 
dynamical variable that commutes with $H$, i.e. with every constant 
of the motion. If the $\alpha$'s are the only constants of the motion, or if 
they are a set that commute with all other independent constants of 
the motion, our problem reduces to finding a function $\beta$ of the $\alpha$'s 
which commutes with all the $\alpha$'s. We can then assign a numerical 
value $\beta'$ for $\beta$ to each energy-level of the system. If we can find 
several such functions $\beta$, they must all commute with each other, so 
that we can give them all numerical values simultaneously. We 
obtain thus a classification of the energy-levels. When the Hamiltonian 
involves the time explicitly one cannot talk about energy-levels, but 
the $\beta$'s will still give a useful classification of the states. 

We follow this method in dealing with our permutations $P$. We 
must find a function $\chi$ of the $P$'s such that $P\chi P^{-1} = \chi$ for every $P$. 
It is evident that a possible $\chi$ is $\sum P_c$, the sum of all the permutations 
in a certain class $c$, i.e. the sum of a set of similar permutations, since 
$\sum PP_cP^{-1}$ must consist of the same permutations summed in a 
different order. There will be one such $\chi$ for each class. Further, there can 
be no other independent $\chi$, since an arbitrary function of the $P$'s can 
be expressed as a linear function of them with numerical coefficients, 
and it will not then commute with every $P$ unless the coefficients of 
similar $P$'s are always the same. We thus obtain all the $\chi$'s that can 
be used for classifying the states. It is convenient to define each $\chi$ as 
an average instead of a sum, thus 

$$\chi_c = n_c^{-1}\sum P_c,$$

where $n_c$ is the number of $P$'s in the class $c$. An alternative expression 
for $\chi_c$ is 

$$\chi_c = n!^{-1}\sum_P PP_cP^{-1}, \tag{13}$$

the sum being extended over all the $n!$ permutations $P$, it being easy 
to verify that this sum contains each member of the class $c$ the same 
number of times. For each permutation $P$ there is one $\chi$, $\chi(P)$ say, 
equal to the average of all permutations similar to $P$. One of the 
$\chi$'s is $\chi(P_1) = 1$. 
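
That the average of a class commutes with every permutation can be checked for a small case. The sketch below does this for three particles; it represents a linear function of the permutations as a dictionary from permutation to coefficient, and this representation, like the helper names, is an assumption made for illustration.

```python
from fractions import Fraction
from itertools import permutations

n = 3
group = list(permutations(range(n)))

def compose(p, q):
    """Product p q of two permutations given as tuples of images (q acts first)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

def cycle_type(p):
    """Cycle lengths of a permutation, largest first."""
    seen, lengths = set(), []
    for start in range(len(p)):
        if start in seen:
            continue
        length, i = 0, start
        while i not in seen:
            seen.add(i)
            i = p[i]
            length += 1
        lengths.append(length)
    return tuple(sorted(lengths, reverse=True))

def class_average(partition):
    """The average of a class as a dict {permutation: coefficient}."""
    members = [p for p in group if cycle_type(p) == partition]
    return {p: Fraction(1, len(members)) for p in members}

def conjugate(chi, P):
    """P chi P^{-1}, term by term."""
    return {compose(compose(P, p), inverse(P)): c for p, c in chi.items()}

classes = sorted({cycle_type(p) for p in group})
invariant = all(conjugate(class_average(c), P) == class_average(c)
                for c in classes for P in group)
print(classes, invariant)
```

For three particles there are three classes, and each class average is unchanged when transformed by any of the six permutations.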

The constants of the motion $\chi_1, \chi_2,\ldots, \chi_m$ obtained in this way will 
each have a definite numerical value for every stationary state of the 
system, in the case when the Hamiltonian does not involve the time 
explicitly, and also in the general case can be used for classifying 
the states, there being one set of states for every permissible set of 
numerical values $\chi'_1, \chi'_2,\ldots, \chi'_m$ for the $\chi$'s. Since the $\chi$'s are always 
constants of the motion, these sets of states will be exclusive, i.e. 
transitions will never take place from a state in one set to a state in 
another. 

The permissible sets of values $\chi'$ that one can give to the $\chi$'s are 
limited by the fact that there exist algebraic relations between the 
$\chi$'s. The product of any two $\chi$'s, $\chi_p\chi_q$, is of course expressible as 
a linear function of the $P$'s, and since it commutes with every $P$ it 
must be expressible as a linear function of the $\chi$'s, thus 

$$\chi_p\chi_q = a_1\chi_1+a_2\chi_2+\ldots+a_m\chi_m, \tag{14}$$

where the $a$'s are numbers. Any numerical values $\chi'$ that one gives 
to the $\chi$'s must be eigenvalues of the $\chi$'s and must satisfy these same 
algebraic equations. For every solution $\chi'$ of these equations there 
is one exclusive set of states. One solution is evidently $\chi'_p = 1$ for 
every $\chi_p$, giving the set of symmetrical states. A second obvious 
solution, giving the set of antisymmetrical states, is $\chi'_p = \pm 1$, the 
$+$ or $-$ sign being taken according to whether the permutations in 
the class $p$ are even or odd. The other solutions may be worked out 
in any special case by ordinary algebraic methods, as the coefficients 
$a$ in (14) may be obtained directly by a consideration of the types 
of permutation to which the $\chi$'s concerned refer. Any solution is, 
apart from a certain factor, what is called in group theory a character 
of the group of permutations. The $\chi$'s are all real dynamical variables, 
since each $P$ and its conjugate complex $P^{-1}$ are similar and will occur 
added together in the definition of any $\chi$, so that the $\chi'$'s must be all 
real numbers. 
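
For three particles the relations (14) and their solutions can be worked out by direct enumeration. The following is a sketch under stated assumptions: the classes are labelled by their number of fixed points (a device that works only for $n = 3$), the helper names are hypothetical, and the third set of values tested is not quoted in the text but is obtained by solving the relations for this case.

```python
from fractions import Fraction
from itertools import permutations

group = list(permutations(range(3)))

def compose(p, q):
    """Product p q (q acts first)."""
    return tuple(p[q[i]] for i in range(len(q)))

def class_of(p):
    """Label a class by its number of fixed points: 3 identity, 1 interchange, 0 3-cycle."""
    return sum(p[i] == i for i in range(3))

labels = (3, 1, 0)
size = {c: sum(class_of(p) == c for p in group) for c in labels}

def structure_constants(cp, cq):
    """Coefficients a_r in chi_p chi_q = sum_r a_r chi_r, as in (14)."""
    a = {c: Fraction(0) for c in labels}
    for p in group:
        for q in group:
            if class_of(p) == cp and class_of(q) == cq:
                a[class_of(compose(p, q))] += Fraction(1, size[cp] * size[cq])
    return a

def satisfies(values):
    """True if the numbers chi'_c = values[c] obey every relation (14)."""
    return all(
        values[cp] * values[cq]
        == sum(a * values[c] for c, a in structure_constants(cp, cq).items())
        for cp in labels for cq in labels
    )

symmetrical = {3: 1, 1: 1, 0: 1}
antisymmetrical = {3: 1, 1: -1, 0: 1}
third = {3: 1, 1: 0, 0: Fraction(-1, 2)}
print(satisfies(symmetrical), satisfies(antisymmetrical), satisfies(third))
```

All three sets of values satisfy the relations, in agreement with there being three classes for $n = 3$.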

The number of possible solutions of the equations (14) may easily 
be determined, since it must equal the number of different 
eigenvalues of an arbitrary function $B$ of the $\chi$'s. We can express $B$ as 
a linear function of the $\chi$'s with the help of equations (14); thus 

$$B = b_1\chi_1+b_2\chi_2+\ldots+b_m\chi_m. \tag{15}$$

Similarly, we can express each of the quantities $B^2$, $B^3$,..., $B^m$ as a 
linear function of the $\chi$'s. From the $m$ equations thus obtained, 
together with the equation $\chi(P_1) = 1$, we can eliminate the $m$ 
unknowns $\chi_1, \chi_2,\ldots, \chi_m$, obtaining as result an algebraic equation of 
degree $m$ for $B$, 

$$B^m+c_1B^{m-1}+c_2B^{m-2}+\ldots+c_m = 0.$$

The $m$ solutions of this equation give the $m$ possible eigenvalues 
for $B$, each of which will, according to (15), be a linear function of $b_1$, 
$b_2$,..., $b_m$ whose coefficients are a permissible set of values $\chi'_1, \chi'_2,\ldots, \chi'_m$. 
The sets of values $\chi'$ thus obtained must be all different, since if 
there were fewer than $m$ different permissible sets of values $\chi'$ for the 
$\chi$'s, there would exist a linear function of the $\chi$'s every one of whose 
eigenvalues vanishes, which would mean that the linear function itself 
vanishes and the $\chi$'s are not linearly independent. Thus the number of 
permissible sets of numerical values for the $\chi$'s is just equal to $m$, which 
is the number of classes of permutations or the number of partitions 
of $n$. This number is therefore the number of exclusive sets of states. 
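
The equality between the number of classes and the number of partitions of $n$ can be confirmed by enumeration for small $n$. This is a minimal sketch; the helper names `cycle_type` and `count_partitions` are illustrative.

```python
from itertools import permutations

def cycle_type(p):
    """Cycle lengths of a permutation given as a tuple of images, largest first."""
    seen, lengths = set(), []
    for start in range(len(p)):
        if start in seen:
            continue
        length, i = 0, start
        while i not in seen:
            seen.add(i)
            i = p[i]
            length += 1
        lengths.append(length)
    return tuple(sorted(lengths, reverse=True))

def count_partitions(n, largest=None):
    """Number of ways of writing n as a sum of positive integers, order ignored."""
    if largest is None:
        largest = n
    if n == 0:
        return 1
    # Choose the largest part k, then partition the remainder into parts <= k.
    return sum(count_partitions(n - k, k) for k in range(1, min(n, largest) + 1))

for n in range(1, 7):
    classes = {cycle_type(p) for p in permutations(range(n))}
    print(n, len(classes), count_partitions(n))
```

For $n$ from 1 to 6 both counts come out as 1, 2, 3, 5, 7, 11.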

All dynamical variables of physical importance and all observable 
quantities are symmetrical between the particles and thus commute 
with all the $P$'s. Thus the only functions of the $P$'s of physical 
importance are the $\chi$'s. The states corresponding to $|\chi'\rangle$ and to 
$f(P)|\chi'\rangle$, where $|\chi'\rangle$ is any eigenket of the $\chi$'s belonging to the 
eigenvalues $\chi'$ and $f(P)$ is any function of the $P$'s such that $f(P)|\chi'\rangle \neq 0$, 
are observationally indistinguishable and are thus physically 
equivalent. There is a definite number, $n(\chi')$ say, of independent kets which 
can be formed by multiplying $|\chi'\rangle$ by functions of the $P$'s, which 
number depends only on the $\chi'$'s. It is the number of rows and 
columns in a matrix representation of the $P$'s in which each $\chi$ is 
equal to $\chi'$. If $|\chi'\rangle$ corresponds to a stationary state, $n(\chi')$ will be 
its degree of degeneracy (so far as concerns degeneracy caused by the 
symmetry between the particles). This degeneracy cannot be removed 
by any perturbation that is symmetrical between the particles. 


57. Determination of the energy-levels 

Let us apply the perturbation method of § 43 and make a first-order 
calculation of the energy-levels in the case when the Hamiltonian 
does not involve the time explicitly. We suppose that for our unper- 
turbed stationary states of the assembly each of the similar particles 
has its own individual state. With $n$ particles, we shall have $n$ of 
these states, corresponding to kets $|\alpha^1\rangle$, $|\alpha^2\rangle$,..., $|\alpha^n\rangle$ say, which we 
assume for the present to be all orthogonal. The ket for the assembly 
is then 

$$|X\rangle = |\alpha^1_1\rangle|\alpha^2_2\rangle\ldots|\alpha^n_n\rangle, \tag{16}$$

like (1) with $\alpha^1$, $\alpha^2$,... instead of $a$, $b$,.... If we apply any permutation 
$P$ to it we get another ket 

$$P|X\rangle = |\alpha^1_r\rangle|\alpha^2_s\rangle\ldots|\alpha^n_z\rangle \tag{17}$$

say, $r$, $s$,..., $z$ being some permutation of the numbers 1, 2,..., $n$, 
corresponding to another stationary state of the assembly with the 
same energy. There are thus altogether $n!$ unperturbed states with 
this energy, if we assume there are no other causes of degeneracy. 
According to the method of § 43 when the unperturbed system is 
degenerate, we must consider those elements of the matrix 
representing the perturbing energy $V$ that refer to two states with the same 
energy, i.e. those of the type $\langle X|P_aVP_b|X\rangle$. These will form a matrix 
with $n!$ rows and columns, whose eigenvalues are the first-order 
corrections in the energy-levels. 

We must now introduce another kind of permutation operator 
which can be applied to kets of the form (17), namely a permutation 
which acts on the indices of the $\alpha$'s. We denote such a permutation 
operator by $P^\alpha$. The essential difference between the $P$'s and the 
$P^\alpha$'s may be seen in the following way. Let us consider a permutation 
in the general sense, say that consisting of the interchange of 2 and 3. 
This may be interpreted either as the interchange of the objects 2 and 
3 or as the interchange of the objects in the places 2 and 3, these two 
operations producing in general quite different results. The first of 
these interpretations is the one that gives the operators P, the objects 
concerned being the similar particles. A permutation P can be 
applied to an arbitrary ket for the assembly. A permutation with the 
second interpretation has a meaning, however, only when applied 
to a ket of the form (17), for which each of the particles is in a ‘place’ 
specified by an $\alpha$, or to a sum of kets of the form (17). A permutation 
$P$ may be considered as an ordinary dynamical variable. A 
permutation $P^\alpha$ may be considered as a dynamical variable in a restricted 
sense, valid when one is dealing only with states obtainable by super- 
position of the various states (17). This is the case for our present 
perturbation problem. 
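
The distinction between interchanging objects and interchanging the objects in given places can be made concrete. The sketch below is an illustration only; the list representation, in which the entry at place $i$ names the object found there, and the function names are assumptions.

```python
def swap_objects(arrangement, x, y):
    """Interchange the objects named x and y, wherever they happen to be."""
    rename = {x: y, y: x}
    return [rename.get(obj, obj) for obj in arrangement]

def swap_places(arrangement, i, j):
    """Interchange whatever objects occupy places i and j (places counted from 1)."""
    result = list(arrangement)
    result[i - 1], result[j - 1] = result[j - 1], result[i - 1]
    return result

arrangement = [3, 1, 2]     # place 1 holds object 3, place 2 holds object 1, ...

print(swap_objects(arrangement, 2, 3))   # [2, 1, 3]
print(swap_places(arrangement, 2, 3))    # [3, 2, 1]
```

The two operations give different results here. They coincide only for special arrangements, such as that in which every object is in the place bearing its own number.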

We can form algebraic functions of the $P^\alpha$ which will be other
operators applicable to kets of the form (17). In particular we can
form $\chi(P_c^\alpha)$, the average of all $P^\alpha$’s in a certain class $c$. This must
equal $\chi(P_c)$, the average of the permutation operators $P$ in the same
class, since the total set of all permutations in a given class must
evidently be the same whether the permutations are applied to the
particles or to the places the particles are in. Any $P$ commutes with
any $P^\alpha$, i.e.

$$P_a P_b^\alpha = P_b^\alpha P_a. \tag{18}$$
By labelling the $\alpha$’s by the same numbers 1, 2, 3,..., $n$ which label
the particles, we set up a one-one correspondence between the $\alpha$’s and
the particles, so that given any permutation $P_a$ applying to the particles,
we can give a meaning to the same permutation $P_a^\alpha$ applying
to the $\alpha$’s. This meaning is such that, for the ket $|X\rangle$ given by (16),

$$P_a^\alpha P_a|X\rangle = |X\rangle. \tag{19}$$
Since the various kets $|\alpha^1\rangle$, $|\alpha^2\rangle$,... are orthogonal, $|X\rangle$ and $P|X\rangle$ are
orthogonal unless $P = 1$. It follows that, for any coefficients $c_P$,

$$\sum_P c_P \langle X|P^\alpha P_a|X\rangle = c_{P_a}, \tag{20}$$

provided $|X\rangle$ is normalized, the summation being over all the $n!$
permutations $P$ or $P^\alpha$, with $P_a$ fixed. Now define $V_P$ by

$$V_P = \langle X|VP|X\rangle. \tag{21}$$
We then have, for any two permutations $P_x$ and $P_y$,

$$\langle X|P_x V P_y|X\rangle = \langle X|V P_x P_y|X\rangle = V_{P_x P_y} = \sum_P V_P \langle X|P^\alpha P_x P_y|X\rangle$$

with the help of (20). From (18) this gives

$$\langle X|P_x V P_y|X\rangle = \sum_P V_P \langle X|P_x P^\alpha P_y|X\rangle. \tag{22}$$

We may write this result as

$$V \approx \sum_P V_P P^\alpha, \tag{23}$$

where the sign $\approx$ means an equation in a restricted sense, the
operators on the two sides being equal so long as they are used only
with kets of the form $P|X\rangle$ and their conjugate imaginary bras.
The formula (23) shows that the perturbing energy $V$ is equal, in
the restricted sense, to a linear function of the permutation operators
$P^\alpha$ with coefficients $V_P$ given by (21). The restricted sense is adequate
for the calculation of the first-order correction in the energy-levels,
as this calculation involves only those matrix elements of $V$ given by
(22). The formula (23) is a very convenient one because the expression
on its right-hand side is easily handled.

As an example of an application of (23) we shall determine the
average energy of all those states, arising from the unperturbed state
(16), that belong to one exclusive set. This requires us to calculate
the average eigenvalue of $V$ for those states (17) for which the $\chi$’s
have specified numerical values $\chi'$. Now the average eigenvalue of
$P^\alpha$ for any of these states equals that of $P_a^\alpha P^\alpha (P_a^\alpha)^{-1}$ for arbitrary
$P_a^\alpha$ and thus equals that of $n!^{-1}\sum_{P_a} P_a^\alpha P^\alpha (P_a^\alpha)^{-1}$, which is $\chi(P^\alpha)$ or
$\chi'(P)$. Hence the average eigenvalue of $V$ is $\sum_P V_P\,\chi'(P)$. A similar
method could be used for calculating the average eigenvalue of any
function of $V$, it being necessary only to replace each $P^\alpha$ by $\chi'(P)$ to
perform the averaging.

The number of energy-levels in an exclusive set $\chi = \chi'$ that arise
from a given state of the unperturbed system is equal to the number
of eigenvalues of the right-hand side of (23) that are consistent with
the equations $\chi = \chi'$. This number is the number $n(\chi')$ introduced
at the end of the preceding section, and is thus just the degree of
degeneracy of the states in this set.

We have assumed that the individual kets $|\alpha^1\rangle$, $|\alpha^2\rangle$,... which determine
the unperturbed state according to (16) are all orthogonal. The
theory can easily be extended to the case when some of these kets are
equal, any two that are not equal being still restricted to be orthogonal.
We now have some permutations $P^\alpha$ such that $P^\alpha|X\rangle = |X\rangle$,
namely those permutations which involve only interchanges of
equal $\alpha$’s. Equation (20) will now hold if the summation is extended
only over those $P$’s which make $P^\alpha|X\rangle$ different. With this change
in the meaning of $\sum_P$, all the previous equations still hold, including
the result (23). For the present $|X\rangle$ there will be restrictions on the
possible numerical values of the $\chi$’s, e.g. they cannot have those
values corresponding to $|X\rangle$ being antisymmetrical.


58. Application to electrons 

Let us consider the case when the similar particles are electrons. 
This requires, according to Pauli’s exclusion principle discussed in 
§ 54, that we take into account only the antisymmetrical states. It 
is now necessary to make explicit reference to the fact that electrons 
have spins, which show themselves through an angular momentum
and a magnetic moment. The effect of the spin on the motion of
an electron in an electromagnetic field is not very great. There 
are additional forces on the electron due to its magnetic moment, 
requiring additional terms in the Hamiltonian. The spin angular 
momentum does not have any direct action on the motion, but it comes 
into play when there are forces tending to rotate the magnetic moment, 
since the magnetic moment and angular momentum are constrained 
to be always in the same direction. In the absence of a strong 
magnetic field these effects are all small, of the same order of magni- 
tude as the corrections required by relativistic mechanics, and there 
would be no point in taking them into account in a non-relativistic 
theory. The importance of the spin lies not in these small effects on the 
motion of the electron, but in the fact that it gives two internal states 
to the electron, corresponding to the two possible values of the spin 
component in any assigned direction, which causes a doubling in the 
number of independent states of an electron. This fact has far-reaching
consequences when combined with Pauli’s exclusion principle. 

In dealing with an assembly of electrons we have two kinds of
dynamical variables. The first kind, which we may call the orbital
variables, consists of the coordinates $x$, $y$, $z$ of all the electrons and
their conjugate momenta $p_x$, $p_y$, $p_z$. The second kind consists of the
spin variables, the variables $\sigma_x$, $\sigma_y$, $\sigma_z$, as introduced in § 37, for all
the electrons. These two kinds of variables belong to different degrees
of freedom. According to §§ 20 and 21, a ket fixing the state of the
whole system may be of the form $|A\rangle|B\rangle$, where $|A\rangle$ is a ket referring
to the orbital variables alone and $|B\rangle$ is a ket referring to the spin
variables alone, and the general ket fixing a state of the whole system
is a sum or integral of kets of this form. This way of looking at things
enables us to introduce two kinds of permutation operators, the first
kind, $P^x$ say, applying to the orbital variables only and operating
only on the factor $|A\rangle$, and the second kind, $P^\sigma$ say, applying only
to the spin variables and operating only on the factor $|B\rangle$. The $P^x$’s
and $P^\sigma$’s can each be applied to any ket for the whole system, not
merely to certain special kets, like the $P^\alpha$’s of the preceding section.
The permutations $P$ that we have had up to the present apply to all
the dynamical variables of the particles concerned, so for electrons
they will apply to both the orbital and the spin variables. This means
that each $P_a$ equals the product

$$P_a = P_a^x P_a^\sigma. \tag{24}$$

We can now see the need for taking the spin variables into account
when applying Pauli’s exclusion principle, even if we neglect the spin
forces in the Hamiltonian. For any state occurring in nature each
$P_a$ must have the value $\pm 1$, according to whether it is an even or
an odd permutation, so from (24)

$$P_a^x P_a^\sigma = \pm 1. \tag{25}$$

The theory of the three preceding sections would become trivial if
applied directly to electrons, for which each $P_a = \pm 1$. We may,
however, apply it to the $P^x$ permutations of electrons. The $P^\sigma$’s are
constants of the motion if we neglect the terms in the Hamiltonian
that arise from the spin forces, since this neglect results in the
Hamiltonian not involving the spin dynamical variables $\sigma$ at all. The
$P^x$’s must then also be constants of the motion. We can now introduce
new $\chi$’s, equal to the average of all of the $P^x$’s in each class, and
assert that for any permissible set of numerical values $\chi'$ for these $\chi$’s
there will be one exclusive set of states. Thus there exist exclusive sets
of states for systems containing many electrons even when we restrict
ourselves to a consideration of only those states that satisfy Pauli’s
principle. The exclusiveness of the sets of states is now, of course,
only approximate, since the $\chi$’s are constants only so long as we
neglect the spin forces. There will actually be a small probability for
a transition from a state in one set to a state in another.

Equation (25) gives us a simple connexion between the $P^x$’s and
$P^\sigma$’s, which means that instead of studying the dynamical variables
$P^x$ we can get all the results we want, e.g. the characters $\chi'$, by
studying the dynamical variables $P^\sigma$. The $P^\sigma$’s are much easier to
study on account of there being only two independent states of spin
for each electron. This fact results in there being fewer characters $\chi'$
for the group of permutations of the $\sigma$-variables than for the group
of general permutations, since it prevents a ket in the spin variables
from being antisymmetrical in more than two of them.

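The study of the $P^\sigma$’s is made specially easy by the fact that we
can express them as algebraic functions of the dynamical variables $\sigma$.
Consider the quantity

$$O_{12} = \tfrac12\{1 + \sigma_{x1}\sigma_{x2} + \sigma_{y1}\sigma_{y2} + \sigma_{z1}\sigma_{z2}\} = \tfrac12\{1 + (\boldsymbol{\sigma}_1, \boldsymbol{\sigma}_2)\}.$$

With the help of equations (50) and (51) of § 37 we find readily that

$$(\boldsymbol{\sigma}_1, \boldsymbol{\sigma}_2)^2 = (\sigma_{x1}\sigma_{x2} + \sigma_{y1}\sigma_{y2} + \sigma_{z1}\sigma_{z2})^2 = 3 - 2(\boldsymbol{\sigma}_1, \boldsymbol{\sigma}_2), \tag{26}$$

and hence that

$$O_{12}^2 = \tfrac14\{1 + 2(\boldsymbol{\sigma}_1, \boldsymbol{\sigma}_2) + (\boldsymbol{\sigma}_1, \boldsymbol{\sigma}_2)^2\} = 1. \tag{27}$$

Again, we find

$$O_{12}\,\sigma_{x1} = \tfrac12\{\sigma_{x1} + \sigma_{x2} - i\sigma_{z1}\sigma_{y2} + i\sigma_{y1}\sigma_{z2}\},$$

$$\sigma_{x2}\,O_{12} = \tfrac12\{\sigma_{x2} + \sigma_{x1} + i\sigma_{y1}\sigma_{z2} - i\sigma_{z1}\sigma_{y2}\},$$

and hence

$$O_{12}\,\sigma_{x1} = \sigma_{x2}\,O_{12}.$$

Similar relations hold for $\sigma_{y1}$ and $\sigma_{z1}$, so that we have

$$O_{12}\,\boldsymbol{\sigma}_1 = \boldsymbol{\sigma}_2\,O_{12}$$

or

$$O_{12}\,\boldsymbol{\sigma}_1\,O_{12}^{-1} = \boldsymbol{\sigma}_2.$$

From this we can obtain with the help of (27)

$$O_{12}\,\boldsymbol{\sigma}_2\,O_{12}^{-1} = \boldsymbol{\sigma}_1.$$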
These commutation relations for $O_{12}$ with $\boldsymbol{\sigma}_1$ and $\boldsymbol{\sigma}_2$ are precisely the
same as those for $P_{12}^\sigma$, the permutation consisting of the interchange
of the spin variables of electrons 1 and 2. Thus we can put

$$O_{12} = cP_{12}^\sigma,$$

where $c$ is a number. Equation (27) shows that $c = \pm 1$. To determine
which of these values for $c$ is the correct one, we observe that
the eigenvalues of $P_{12}^\sigma$ are $1, 1, 1, -1$, corresponding to the fact that
there exist three independent symmetrical and one antisymmetrical
state in the spin variables of two electrons, namely, with the notation
of § 37, the states represented by the three symmetrical functions
$f_\alpha(\sigma'_{z1})f_\alpha(\sigma'_{z2})$, $f_\beta(\sigma'_{z1})f_\beta(\sigma'_{z2})$, $f_\alpha(\sigma'_{z1})f_\beta(\sigma'_{z2}) + f_\beta(\sigma'_{z1})f_\alpha(\sigma'_{z2})$, and the one
antisymmetrical function $f_\alpha(\sigma'_{z1})f_\beta(\sigma'_{z2}) - f_\beta(\sigma'_{z1})f_\alpha(\sigma'_{z2})$. Thus the mean
of the eigenvalues of $P_{12}^\sigma$ is $\tfrac12$. Now the mean of the eigenvalues of
$(\boldsymbol{\sigma}_1, \boldsymbol{\sigma}_2)$ is evidently zero and hence the mean of the eigenvalues of $O_{12}$
is $\tfrac12$. Thus we must have $c = +1$, and so we can put

$$P_{12}^\sigma = \tfrac12\{1 + (\boldsymbol{\sigma}_1, \boldsymbol{\sigma}_2)\}. \tag{28}$$
In this way any permutation $P^\sigma$ consisting simply of an interchange
can be expressed as an algebraic function of the $\sigma$’s. Any other permutation
$P^\sigma$ can be expressed as a product of interchanges and can
therefore also be expressed as a function of the $\sigma$’s. With the help of
(25) we can now express the $P^x$’s as algebraic functions of the $\sigma$’s and
eliminate the $P^\sigma$’s from the discussion. We have, since the $-$ sign
must be taken in (25) when the permutations are interchanges and
since the square of an interchange is unity,

$$P_{12}^x = -\tfrac12\{1 + (\boldsymbol{\sigma}_1, \boldsymbol{\sigma}_2)\}. \tag{29}$$
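The identification (28) of the spin interchange with an algebraic function of the $\sigma$’s can be checked numerically. The following is a minimal sketch, assuming the usual 2×2 Pauli matrices as a representation of the spin variables and the basis ordering produced by `numpy.kron`; all variable names are illustrative.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

# (sigma_1, sigma_2) acting on the 4-dimensional two-spin space
dot12 = sum(np.kron(s, s) for s in (sx, sy, sz))
O12 = 0.5 * (np.eye(4) + dot12)

# Interchange of the two spins: |a>|b> -> |b>|a>
swap = np.zeros((4, 4))
for a in range(2):
    for b in range(2):
        swap[2 * b + a, 2 * a + b] = 1.0

# Eigenvalues of O12, sorted in ascending order
eigenvalues = np.sort(np.linalg.eigvalsh(O12))
```

In this representation `O12` coincides with the interchange matrix `swap`, its square is the unit matrix as in (27), and it carries `sigma_x` of the first spin into `sigma_x` of the second.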
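The formula (29) may conveniently be used for the evaluation of
the characters $\chi'$ which define the exclusive sets of states. We have,
for example, for the permutations consisting of interchanges,

$$\chi_{12} = \chi(P_{12}^x) = -\frac12\Bigl\{1 + \frac{2}{n(n-1)}\sum_{r<s}(\boldsymbol{\sigma}_r, \boldsymbol{\sigma}_s)\Bigr\}.$$

If we introduce the dynamical variable $s$ to describe the magnitude of
the total spin angular momentum, $\tfrac12\sum_r \boldsymbol{\sigma}_r$ in units of $\hbar$, through the
formula

$$s(s+1) = \Bigl(\tfrac12\sum_r \boldsymbol{\sigma}_r,\ \tfrac12\sum_t \boldsymbol{\sigma}_t\Bigr),$$

in agreement with (39) of § 36, we have

$$2\sum_{r<s}(\boldsymbol{\sigma}_r, \boldsymbol{\sigma}_s) = \Bigl(\sum_r \boldsymbol{\sigma}_r,\ \sum_t \boldsymbol{\sigma}_t\Bigr) - \sum_r(\boldsymbol{\sigma}_r, \boldsymbol{\sigma}_r) = 4s(s+1) - 3n.$$

Hence

$$\chi_{12} = -\frac12\Bigl\{1 + \frac{4s(s+1) - 3n}{n(n-1)}\Bigr\} = -\frac{n(n-4) + 4s(s+1)}{2n(n-1)}. \tag{30}$$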
Thus $\chi_{12}$ is expressible as a function of the dynamical variable $s$ and
of $n$ the number of electrons. Any of the other $\chi$’s could be evaluated
on similar lines and would have to be a function of $s$ and $n$ only, since
there are no other symmetrical functions of all the $\sigma$ dynamical
variables which could be involved. There is therefore one set of
numerical values $\chi'$ for the $\chi$’s, and thus one exclusive set of states,
for each eigenvalue $s'$ of $s$. The eigenvalues of $s$ are

$$\tfrac12 n,\quad \tfrac12 n - 1,\quad \tfrac12 n - 2,\quad \ldots,$$

the series terminating with 0 or $\tfrac12$.
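The dependence of $\chi_{12}$ on $s$ and $n$ alone can be tested on a small case. The sketch below evaluates the closed formula for the character of an interchange and compares it, for three electrons, with the eigenvalues of minus the average of the spin interchanges built from Pauli matrices. The function names and the matrix representation are assumptions made for illustration only.

```python
from fractions import Fraction

import numpy as np


def chi12(n, s):
    """Character of an interchange for n electrons with total spin s."""
    s = Fraction(s)
    return -(n * (n - 4) + 4 * s * (s + 1)) / Fraction(2 * n * (n - 1))


def spin_values(n):
    """Eigenvalues n/2, n/2 - 1, ... of s, ending at 0 or 1/2."""
    s = Fraction(n, 2)
    values = []
    while s >= 0:
        values.append(s)
        s -= 1
    return values


def spin_exchange(n, r, s):
    """(1 + (sigma_r, sigma_s)) / 2 on the 2**n-dimensional spin space."""
    paulis = [np.array([[0, 1], [1, 0]], dtype=complex),
              np.array([[0, -1j], [1j, 0]], dtype=complex),
              np.array([[1, 0], [0, -1]], dtype=complex)]
    total = np.eye(2 ** n, dtype=complex)
    for p in paulis:
        factors = [np.eye(2, dtype=complex)] * n
        factors[r] = p
        factors[s] = p
        term = factors[0]
        for f in factors[1:]:
            term = np.kron(term, f)
        total = total + term
    return 0.5 * total


def chi12_operator(n):
    """Minus the average of all spin interchanges, i.e. the average orbital interchange."""
    pairs = [(r, s) for r in range(n) for s in range(r + 1, n)]
    return -sum(spin_exchange(n, r, s) for r, s in pairs) / len(pairs)
```

For two electrons the formula gives $+1$ for $s = 0$ and $-1$ for $s = 1$; for three electrons the operator has only the two eigenvalues $-1$ ($s = \tfrac32$) and $0$ ($s = \tfrac12$), one for each allowed value of $s$.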

We see in this way that each of the stationary states of a system
with several electrons is an eigenstate of $s$, the magnitude in units of
$\hbar$ of the total spin angular momentum $\tfrac12\sum_r \boldsymbol{\sigma}_r$, belonging to a definite
eigenvalue $s'$. For any given $s'$ there will be $2s'+1$ possible values
for a component of the total spin vector in any direction and these
will correspond to $2s'+1$ independent stationary states with the same
energy. When we do not neglect the forces due to the spin magnetic
moments these $2s'+1$ states will in general be split up into $2s'+1$
states with slightly different energies, and will thus form a multiplet
of multiplicity $2s'+1$. Transitions in which $s'$ changes, i.e. transitions
from one multiplicity to another, cannot occur when the spin forces
are neglected and will have only a small probability of occurrence
when the spin forces are not neglected.

We can determine the energy-levels of a system with several
electrons to the first approximation by applying the theory of the
preceding section with the kets $|\alpha^r\rangle$ referring only to the orbital
variables and using formula (23). If we consider only the Coulomb
forces between the electrons, then the interaction energy $V$ will
consist of a sum of parts each referring to only two electrons, which
will result in all the matrix elements $V_P$ vanishing except those for
which $P^\alpha$ is the identical permutation or is simply an interchange of
two electrons. Thus (23) will reduce to

$$V \approx V_1 + \sum_{r<s} V_{rs} P_{rs}^\alpha, \tag{31}$$

$V_{rs}$ being the matrix element referring to the interchange of electrons
$r$ and $s$. Since the $P^\alpha$’s have the same properties as the $P^x$’s, any
function of the $P^\alpha$’s will have the same eigenvalues as the corresponding
function of the $P^x$’s, so that the right-hand side of (31)
will have the same eigenvalues as

$$V_1 + \sum_{r<s} V_{rs} P_{rs}^x$$

or

$$V_1 - \tfrac12\sum_{r<s} V_{rs}\{1 + (\boldsymbol{\sigma}_r, \boldsymbol{\sigma}_s)\} \tag{32}$$


from (29). The eigenvalues of (32) will give the first-order corrections
in the energy-levels. The form of (32) shows that a model which
assumes a coupling energy between the spins of the various electrons,
of magnitude $-\tfrac12 V_{rs}(\boldsymbol{\sigma}_r, \boldsymbol{\sigma}_s)$ for the electrons in the $r$ and $s$ orbital
states, would meet with a fair amount of success. This coupling
energy is much greater than that of the spin magnetic moments. Such
models of the atom were in use before the justification by quantum
mechanics was obtained.
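For two electrons the spin-coupling form of the first-order energy can be diagonalized directly. The sketch below is an illustration under stated assumptions: the numbers `V1` and `V12` are hypothetical matrix elements in arbitrary units, and the Pauli matrices are used as a representation of the spin variables.

```python
import numpy as np

PAULI = (
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
)


def two_electron_operator(V1, V12):
    """Expression (32) for n = 2: V1 - (V12 / 2) * {1 + (sigma_1, sigma_2)}."""
    dot12 = sum(np.kron(p, p) for p in PAULI)
    return V1 * np.eye(4) - 0.5 * V12 * (np.eye(4) + dot12)


V1, V12 = 1.0, 0.25  # hypothetical matrix elements, arbitrary units
levels = np.sort(np.linalg.eigvalsh(two_electron_operator(V1, V12)))
```

The three symmetrical spin states (total spin 1) come out at `V1 - V12` and the single antisymmetrical one (total spin 0) at `V1 + V12`, so the splitting between the two multiplicities is `2 * V12`.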

We may have two of the orbital states of the unperturbed system
the same, i.e. the kets $|\alpha^r\rangle$ in the orbital variables for two electrons
may be the same. Suppose $|\alpha^1\rangle$ and $|\alpha^2\rangle$ are the same. Then we must
take only those eigenvalues of (31) that are consistent with $P_{12}^\alpha = 1$,
or those eigenvalues of (32) that are consistent with $P_{12}^x = 1$ or
$P_{12}^\sigma = -1$. From (28) this condition gives $(\boldsymbol{\sigma}_1, \boldsymbol{\sigma}_2) = -3$, so that
$(\boldsymbol{\sigma}_1 + \boldsymbol{\sigma}_2)^2 = 0$. Thus the resultant of the two spins $\boldsymbol{\sigma}_1$ and $\boldsymbol{\sigma}_2$ is zero,
which may be interpreted as the spins $\boldsymbol{\sigma}_1$ and $\boldsymbol{\sigma}_2$ being antiparallel.
Thus we may say that two electrons in the same orbital state have
their spins antiparallel. More than two electrons cannot be in the
same orbital state.


X
THEORY OF RADIATION

59. An assembly of bosons


We consider a dynamical system composed of $u'$ similar particles.
We set up a representation for one of the particles with discrete basic
kets $|\alpha^{(1)}\rangle$, $|\alpha^{(2)}\rangle$, $|\alpha^{(3)}\rangle$,.... Then, as explained in § 54, we get a symmetrical
representation of the assembly of $u'$ particles by taking as
basic kets the products

$$|\alpha_1^a\rangle|\alpha_2^b\rangle|\alpha_3^c\rangle\ldots|\alpha_{u'}^g\rangle = |\alpha_1^a\alpha_2^b\alpha_3^c\ldots\alpha_{u'}^g\rangle \tag{1}$$

in which there is one factor for each particle, the suffixes 1, 2, 3,..., $u'$
of the $\alpha$’s being the labels of the particles and the indices $a$, $b$, $c$,..., $g$
denoting indices $(1)$, $(2)$, $(3)$,... in the basic kets for one particle. If the
particles are bosons, so that only symmetrical states occur in nature,
then we need to work with only the symmetrical kets that can be
constructed from the kets (1). The states corresponding to these
symmetrical kets will form a complete set of states for the assembly
of bosons. We can build up a theory of them as follows.
We introduce the linear operator $S$ defined by

$$S = u'!^{-1/2}\sum P, \tag{2}$$

the sum being taken over all the $u'!$ permutations of the $u'$ particles.
Then $S$ applied to any ket for the assembly gives a symmetrical ket.
We may therefore call $S$ the symmetrizing operator. From (8) of § 55
it is real. Applied to the ket (1) it gives

$$u'!^{-1/2}\sum P|\alpha_1^a\alpha_2^b\alpha_3^c\ldots\alpha_{u'}^g\rangle = S|\alpha^a\alpha^b\alpha^c\ldots\alpha^g\rangle, \tag{3}$$

the labels of the particles being omitted on the right-hand side as
they are no longer relevant. The ket (3) corresponds to a state for
the assembly of $u'$ bosons with a definite distribution of the bosons
among the various boson states, without any particular boson being
assigned to any particular state. The distribution of bosons is specified
if we specify how many bosons are in each boson state. Let
$n'_1$, $n'_2$, $n'_3$,... be the numbers of bosons in the states $\alpha^{(1)}$, $\alpha^{(2)}$, $\alpha^{(3)}$,...
respectively with this distribution. The $n'$’s are defined algebraically
by the equation

$$\alpha^a + \alpha^b + \alpha^c + \ldots + \alpha^g = n'_1\alpha^{(1)} + n'_2\alpha^{(2)} + n'_3\alpha^{(3)} + \ldots. \tag{4}$$
The sum of the $n'$’s is of course $u'$. The number of $n'$’s is equal to
the number of basic kets $|\alpha^{(a)}\rangle$, which in most applications of the
theory is very much greater than $u'$, so most of the $n'$’s will be zero.
If $\alpha^a$, $\alpha^b$, $\alpha^c$,..., $\alpha^g$ are all different, i.e. if the $n'$’s are all 0 or 1, the
ket (3) is normalized, since in this case the terms on the left-hand
side of (3) are all orthogonal to one another and each contributes
$u'!^{-1}$ to the squared length of the ket. However, if $\alpha^a$, $\alpha^b$, $\alpha^c$,..., $\alpha^g$
are not all different, those terms on the left-hand side of (3) will
be equal which arise from permutations $P$ which merely interchange
bosons in the same state. The number of equal terms will be
$n'_1!\,n'_2!\,n'_3!\ldots$, so the squared length of the ket (3) will be

$$\langle\alpha^a\alpha^b\alpha^c\ldots\alpha^g|S^2|\alpha^a\alpha^b\alpha^c\ldots\alpha^g\rangle = n'_1!\,n'_2!\,n'_3!\ldots. \tag{5}$$
For dealing with a general state of the assembly we can introduce
the numbers $n_1$, $n_2$, $n_3$,... of bosons in the states $\alpha^{(1)}$, $\alpha^{(2)}$, $\alpha^{(3)}$,...
respectively and treat the $n$’s as dynamical variables or as observables.
They have the eigenvalues 0, 1, 2,..., $u'$. The ket (3) is a
simultaneous eigenket of all the $n$’s, belonging to the eigenvalues
$n'_1$, $n'_2$, $n'_3$,.... The various kets (3) form a complete set for the
dynamical system consisting of $u'$ bosons, so the $n$’s all commute
(see the converse to the theorem of § 13). Further, there is only one
independent ket (3) belonging to any set of eigenvalues $n'_1$, $n'_2$, $n'_3$,....
Hence the $n$’s form a complete set of commuting observables. If we
normalize the kets (3) and then label the resulting kets by the
eigenvalues of the $n$’s to which they belong, i.e. if we put

$$(n'_1!\,n'_2!\,n'_3!\ldots)^{-1/2}\,S|\alpha^a\alpha^b\alpha^c\ldots\alpha^g\rangle = |n'_1 n'_2 n'_3\ldots\rangle, \tag{6}$$

we get a set of kets $|n'_1 n'_2 n'_3\ldots\rangle$, with the $n'$’s taking on all non-negative
integral values adding up to $u'$, which kets will form the basic kets
of a representation with the $n$’s diagonal.
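The squared length (5) of a symmetrized ket can be verified by brute force for a few bosons. The sketch below represents a product ket as a tensor with one axis per particle and applies $S$ by summing over all permutations of the axes. The function names and the zero-based state indices are assumptions made for this illustration.

```python
import itertools
import math

import numpy as np


def product_ket(indices, d):
    """Product ket with particle r in one-particle state indices[r] (0-based)."""
    ket = np.zeros((d,) * len(indices))
    ket[tuple(indices)] = 1.0
    return ket


def symmetrize(ket):
    """Apply S = u'!^(-1/2) * (sum over all permutations of the particles)."""
    u = ket.ndim
    total = sum(np.transpose(ket, perm)
                for perm in itertools.permutations(range(u)))
    return total / math.sqrt(math.factorial(u))


def squared_length(indices, d):
    """Squared length of the symmetrized ket built from the given product ket."""
    s = symmetrize(product_ket(indices, d))
    return float(np.sum(s * s))


def occupation_factorials(indices, d):
    """n'_1! n'_2! n'_3! ... for the distribution described by indices."""
    counts = np.bincount(indices, minlength=d)
    return math.prod(math.factorial(int(c)) for c in counts)
```

For example, three bosons with two of them in the same state give a squared length of $2!\,1! = 2$, so the normalizing factor in (6) is $2^{-1/2}$ in that case.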
The $n$’s can be expressed as functions of the observables $\alpha_1$, $\alpha_2$,
$\alpha_3$,..., $\alpha_{u'}$ which define the basic kets of the individual bosons by
means of the equations

$$n_a = \sum_r \delta_{\alpha_r\alpha^a}, \tag{7}$$

or the equations

$$\sum_a n_a f(\alpha^a) = \sum_r f(\alpha_r) \tag{8}$$

holding for any function $f$.

Let us now suppose that the number of bosons in the assembly is
not given, but is variable. This number is then a dynamical variable
or observable $u$, with eigenvalues 0, 1, 2,..., and the ket (3) is an
eigenket of $u$ belonging to the eigenvalue $u'$. To get a complete
set of kets for our dynamical system we must now take all the
symmetrical kets (3) for all values of $u'$. We may arrange them in
order thus

$$|\rangle,\quad |\alpha^a\rangle,\quad S|\alpha^a\alpha^b\rangle,\quad S|\alpha^a\alpha^b\alpha^c\rangle,\quad \ldots, \tag{9}$$

where first is written the ket, with no label, corresponding to the
state with no bosons present, then come the kets corresponding to
states with one boson present, then those corresponding to states
with two bosons, and so on. A general state corresponds to a ket
which is a sum of the various kets (9). The kets (9) are all orthogonal
to one another, two kets referring to the same number of bosons being
orthogonal as before, and two referring to different numbers of bosons
being orthogonal since they are eigenkets of $u$ belonging to different
eigenvalues. By normalizing all the kets (9), we get a set of kets like
(6) with no restriction on the $n'$’s (i.e. each $n'$ taking on all non-negative
integral values) and these kets form the basic kets of a
representation with the $n$’s diagonal for the dynamical system consisting
of a variable number of bosons.

If there is no interaction between the bosons and if the basic kets
$|\alpha^{(1)}\rangle$, $|\alpha^{(2)}\rangle$,... correspond to stationary states of a boson, the kets (9)
will correspond to stationary states for the assembly of bosons. The
number $u$ of bosons is now constant in time, but it need not be a
specified number, i.e. the general state is a superposition of states
with various values for $u$. If the energy of one boson is $H(\alpha)$, the
energy of the assembly will be

$$\sum_r H(\alpha_r) = \sum_a n_a H^a \tag{10}$$

from (8), $H^a$ being short for the number $H(\alpha^a)$. This gives the
Hamiltonian for the assembly as a function of the dynamical
variables $n$.
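The passage from a sum over bosons to a sum over boson states, as in (8) and (10), is a simple regrouping of terms. The following sketch illustrates it with made-up data: the list `states` (the state label occupied by each boson) and the table `H` of one-boson energies are hypothetical values chosen only for the example.

```python
from collections import Counter


def occupation_numbers(states):
    """n_a of equation (7): how many bosons sit in each one-boson state."""
    return Counter(states)


def energy_by_particles(states, H):
    """Left-hand side of (10): sum of H over the individual bosons."""
    return sum(H[a] for a in states)


def energy_by_occupations(states, H):
    """Right-hand side of (10): sum of n_a * H^a over the boson states."""
    return sum(n * H[a] for a, n in occupation_numbers(states).items())


# Hypothetical data: five bosons distributed over states labelled 1, 2, 3
states = [1, 1, 3, 2, 1]
H = {1: 1, 2: 3, 3: 5}  # one-boson energies in arbitrary units
```

Both ways of adding up the energy give the same total for any distribution, which is the content of (10).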


60. The connexion between bosons and oscillators 

In § 34 we studied the harmonic oscillator, a dynamical system of
one degree of freedom describable in terms of a canonical $q$ and $p$,
such that the Hamiltonian is a sum of squares of $q$ and $p$, with
numerical coefficients. We define a general oscillator mathematically
as a system of one degree of freedom describable in terms of a
canonical $q$ and $p$, such that the Hamiltonian is a power series in $q$
and $p$, and remains so if the system is perturbed in any way. We
shall now study a dynamical system composed of several of these
oscillators. We can describe each oscillator in terms of, instead of
$q$ and $p$, a complex dynamical variable $\eta$, like the $\eta$ of § 34, and its
conjugate complex $\bar\eta$, satisfying the commutation relation (7) of
§ 34. We attach labels 1, 2, 3,... to the different oscillators, so that
the whole set of oscillators is describable in terms of the dynamical
variables $\eta_1$, $\eta_2$, $\eta_3$,..., $\bar\eta_1$, $\bar\eta_2$, $\bar\eta_3$,... satisfying the commutation
relations

$$\eta_a\eta_b - \eta_b\eta_a = 0,\qquad \bar\eta_a\bar\eta_b - \bar\eta_b\bar\eta_a = 0,\qquad \bar\eta_a\eta_b - \eta_b\bar\eta_a = \delta_{ab}. \tag{11}$$

Put

$$\eta_a\bar\eta_a = n_a \tag{12}$$

so that

$$\bar\eta_a\eta_a = n_a + 1. \tag{13}$$

The $n$’s are observables which commute with one another and the
work of § 34 shows that each of them has as eigenvalues all non-negative
integers. For the $a$th oscillator there is a standard ket for
the Fock representation, $|0_a\rangle$ say, which is a normalized eigenket of $n_a$
belonging to the eigenvalue zero. By multiplying all these standard
kets together we get a standard ket for the Fock representation for
the set of oscillators,

$$|0_1\rangle|0_2\rangle|0_3\rangle\ldots, \tag{14}$$

which is a simultaneous eigenket of all the $n$’s belonging to the
eigenvalues zero. We shall denote it simply by $|0\rangle$. From (13) of § 34

$$\bar\eta_a|0\rangle = 0 \tag{15}$$

for any $a$. The work of § 34 also shows that, if $n'_1$, $n'_2$, $n'_3$,... are any
non-negative integers,

$$\eta_1^{n'_1}\eta_2^{n'_2}\eta_3^{n'_3}\ldots|0\rangle \tag{16}$$

is a simultaneous eigenket of all the $n$’s belonging to the eigenvalues
$n'_1$, $n'_2$, $n'_3$,... respectively. The various kets (16) obtained by taking
different $n'$’s form a complete set of kets all orthogonal to one another
and the square of the length of one of them is, from (16) of § 34,
$n'_1!\,n'_2!\,n'_3!\ldots$. From this we see, bearing in mind the result (5), that
the kets (16) have just the same properties as the kets (9), so that
we can equate each ket (16) to the ket (9) referring to the same $n'$
values without getting any inconsistency. This involves putting

$$S|\alpha^a\alpha^b\alpha^c\ldots\alpha^g\rangle = \eta_a\eta_b\eta_c\ldots\eta_g|0\rangle. \tag{17}$$

The standard ket $|0\rangle$ becomes equal to the first of the kets (9), corresponding
to no bosons present.
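The oscillator relations used here can be tried out with finite matrices. The sketch below is a truncated representation of a single oscillator, under the assumption that only the first `N` levels are kept (the cut-off `N = 8` is arbitrary); because of the truncation, the commutation relation holds only away from the highest retained level.

```python
import math

import numpy as np

N = 8  # number of levels kept (an arbitrary truncation)
eta_bar = np.diag(np.sqrt(np.arange(1, N)), k=1)  # lowers the level by one
eta = eta_bar.T                                    # raises the level by one
number = eta @ eta_bar                             # n = eta * eta_bar

vacuum = np.zeros(N)
vacuum[0] = 1.0  # the standard ket |0>

commutator = eta_bar @ eta - eta @ eta_bar


def squared_length_of_power(k):
    """Squared length of eta**k |0>, expected to equal k! for k < N."""
    ket = np.linalg.matrix_power(eta, k) @ vacuum
    return float(ket @ ket)
```

The matrix `number` is diagonal with entries 0, 1, 2,..., and the squared lengths reproduce the factorials that also appear in the normalization of the symmetrized boson kets, which is what allows the two sets of kets to be equated.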

The effect of equation (17) is to identify the states of an assembly
of bosons with the states of a set of oscillators. This means that the
dynamical system consisting of an assembly of similar bosons is equivalent
to the dynamical system consisting of a set of oscillators: the two
systems are just the same system looked at from two different points of 
view. There is one oscillator associated with each independent boson 
state. We have here one of the most fundamental results of quantum 
mechanics, which enables a unification of the wave and corpuscular 
theories of light to be effected. 

Our work in the preceding section was built up on a discrete set
of basic kets $|\alpha^a\rangle$ for a boson. We could pass to a different discrete
set of basic kets, $|\beta^A\rangle$ say, and build up a similar theory on them.
The basic kets for the assembly would then be, instead of (9),

$$|\rangle,\quad |\beta^A\rangle,\quad S|\beta^A\beta^B\rangle,\quad S|\beta^A\beta^B\beta^C\rangle,\quad \ldots. \tag{18}$$

The first of the kets (18), referring to no bosons present, is the same
as the first of the kets (9). Those kets (18) referring to one boson
present are linear functions of those kets (9) referring to one boson
present, namely

$$|\beta^A\rangle = \sum_a |\alpha^a\rangle\langle\alpha^a|\beta^A\rangle, \tag{19}$$

and generally those kets (18) referring to $u'$ bosons present are linear
functions of those kets (9) referring to $u'$ bosons present. Associated
with the new basic states $|\beta^A\rangle$ for a boson there will be a new set
of oscillator variables $\eta_A$, and corresponding to (17) we shall have

$$S|\beta^A\beta^B\beta^C\ldots\rangle = \eta_A\eta_B\eta_C\ldots|0\rangle. \tag{20}$$

Thus a ket $\eta_A\eta_B\ldots|0\rangle$ with $u'$ factors $\eta_A$, $\eta_B$,... must be a linear function
of kets $\eta_a\eta_b\ldots|0\rangle$ with $u'$ factors $\eta_a$, $\eta_b$,.... It follows that each
linear operator $\eta_A$ must be a linear function of the $\eta_a$’s. Equation
(19) gives

$$\eta_A|0\rangle = \sum_a \eta_a|0\rangle\langle\alpha^a|\beta^A\rangle$$

and hence

$$\eta_A = \sum_a \eta_a\langle\alpha^a|\beta^A\rangle. \tag{21}$$

Thus the $\eta$’s transform according to the same law as the basic kets for
a boson. The transformed $\eta$’s satisfy, with their conjugate complexes,
the same commutation relations (11) as the original ones. The transformed
$\eta$’s are on just the same footing as the original ones and hence,
when we look upon our dynamical system as a set of oscillators, the
different degrees of freedom have no invariant significance.

The $\bar\eta$’s transform according to the same law as the basic bras for
a boson, and thus the same law as the numbers $\langle\alpha^a|x\rangle$ forming the
representative of a state $x$. This similarity people often describe by
saying that the $\bar\eta_a$’s are given by a process of second quantization
applied to $\langle\alpha^a|x\rangle$, meaning thereby that, after one has set up a
quantum theory for a single particle and so introduced the numbers
$\langle\alpha^a|x\rangle$ representing a state of the particle, one can make these numbers
into linear operators satisfying with their conjugate complexes
the correct commutation relations, like (11), and one then has the
appropriate mathematical basis for dealing with an assembly of the
particles, provided they are bosons. There is a corresponding procedure
for fermions, which will be given in § 65.

Since an assembly of bosons is the same as a set of oscillators, it
must be possible to express any symmetrical function of the boson
variables in terms of the oscillator variables $\eta$ and $\bar\eta$. An example
of this is provided by equation (10) with $\eta_a\bar\eta_a$ substituted for $n_a$.
Let us see how it goes in general. Take first the case of a function
of the boson variables of the form

$$U_T = \sum_r U_r, \tag{22}$$


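where each $U_r$ is a function only of the dynamical variables of the
$r$th boson, so that it has a representative $\langle\alpha_r^a|U_r|\alpha_r^b\rangle$ referring to the
basic kets $|\alpha_r^a\rangle$ of the $r$th boson. In order that $U_T$ may be symmetrical,
this representative must be the same for all $r$, so that it can depend
only on the two eigenvalues labelled by $a$ and $b$. We may therefore
write it

$$\langle\alpha_r^a|U_r|\alpha_r^b\rangle = \langle\alpha^a|U|\alpha^b\rangle = \langle a|U|b\rangle \tag{23}$$

for brevity. We have

$$U_r|\alpha_1^{x_1}\alpha_2^{x_2}\ldots\rangle = \sum_a |\alpha_1^{x_1}\alpha_2^{x_2}\ldots\alpha_r^a\ldots\rangle\langle a|U|x_r\rangle. \tag{24}$$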


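Summing this equation for all values of $r$ and applying the symmetrizing
operator $S$ to both sides, we get

$$SU_T|\alpha_1^{x_1}\alpha_2^{x_2}\ldots\rangle = \sum_r\sum_a S|\alpha^{x_1}\alpha^{x_2}\ldots\alpha^a\ldots\rangle\langle a|U|x_r\rangle. \tag{25}$$

Since $U_T$ is symmetrical we can replace $SU_T$ by $U_T S$ and can then
substitute for the symmetrical kets in (25) their values given by (17).
We get in this way

$$U_T\,\eta_{x_1}\eta_{x_2}\ldots|0\rangle = \sum_r\sum_a \eta_{x_1}\eta_{x_2}\ldots\eta_{x_r}^{-1}\eta_a\ldots|0\rangle\langle a|U|x_r\rangle$$

$$= \sum_{ab}\eta_a\sum_r \eta_{x_1}\eta_{x_2}\ldots\eta_{x_r}^{-1}\ldots|0\rangle\,\delta_{bx_r}\langle a|U|b\rangle, \tag{26}$$

$\eta_{x_r}^{-1}$ meaning that the factor $\eta_{x_r}$ must be cancelled out. Now from
(15) and the commutation relations (11)

$$\bar\eta_b\,\eta_{x_1}\eta_{x_2}\ldots|0\rangle = \sum_r \eta_{x_1}\eta_{x_2}\ldots\eta_{x_r}^{-1}\ldots|0\rangle\,\delta_{bx_r} \tag{27}$$

(note that $\bar\eta_b$ is like the operator of partial differentiation $\partial/\partial\eta_b$), so
(26) becomes

$$U_T\,\eta_{x_1}\eta_{x_2}\ldots|0\rangle = \sum_{ab}\eta_a\bar\eta_b\,\eta_{x_1}\eta_{x_2}\ldots|0\rangle\langle a|U|b\rangle. \tag{28}$$

The kets $\eta_{x_1}\eta_{x_2}\ldots|0\rangle$ form a complete set, and hence we can infer from
(28) the operator equation

$$U_T = \sum_{ab}\eta_a\langle a|U|b\rangle\bar\eta_b. \tag{29}$$

This gives us $U_T$ in terms of the $\eta$ and $\bar\eta$ variables and the matrix
elements $\langle a|U|b\rangle$.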
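The operator equation for $U_T$ can be checked on the smallest non-trivial case of two bosons and two one-boson states. The sketch below compares matrix elements of $U_1 + U_2$ between symmetrized product kets with those of $\sum_{ab}\eta_a\langle a|U|b\rangle\bar\eta_b$ between the corresponding oscillator kets. The matrix `U`, the truncation `N`, and all names are hypothetical choices made for the illustration.

```python
import itertools

import numpy as np

N = 4  # levels kept per oscillator; enough for two bosons
lower = np.diag(np.sqrt(np.arange(1, N)), k=1)
I = np.eye(N)
eta = [np.kron(lower.T, I), np.kron(I, lower.T)]  # eta_1, eta_2
eta_bar = [m.T for m in eta]                      # their conjugates
vacuum = np.zeros(N * N)
vacuum[0] = 1.0

# Hypothetical one-boson matrix <a|U|b> (Hermitian)
U = np.array([[0.3, 0.1 + 0.2j], [0.1 - 0.2j, -0.5]])

# Oscillator form: sum over a, b of eta_a <a|U|b> eta_bar_b
U_oscillators = sum(U[a, b] * eta[a] @ eta_bar[b]
                    for a in range(2) for b in range(2))

# Boson form: U_T = U_1 + U_2 on the two-boson product space
e = np.eye(2)
U_T = np.kron(U, e) + np.kron(e, U)


def sym_ket(a, b):
    """S|alpha^a alpha^b> with S = 2!^(-1/2) * (sum over permutations)."""
    return (np.kron(e[a], e[b]) + np.kron(e[b], e[a])) / np.sqrt(2)


def oscillator_ket(a, b):
    """eta_a eta_b |0>, the oscillator ket identified with S|alpha^a alpha^b>."""
    return eta[a] @ eta[b] @ vacuum


def max_mismatch():
    """Largest difference between corresponding matrix elements."""
    worst = 0.0
    for a, b, c, d in itertools.product(range(2), repeat=4):
        lhs = sym_ket(a, b).conj() @ U_T @ sym_ket(c, d)
        rhs = oscillator_ket(a, b).conj() @ U_oscillators @ oscillator_ket(c, d)
        worst = max(worst, abs(lhs - rhs))
    return worst
```

Calling `max_mismatch()` returns a value at rounding-error level, so in this small example the two descriptions give the same matrix elements.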


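Now let us take a symmetrical function of the boson variables
consisting of a sum of terms each referring to two bosons,

$$V_T = \sum_{r,\,s\neq r} V_{rs}. \tag{30}$$

We do not need to assume $V_{rs} = V_{sr}$. Corresponding to (23), $V_{rs}$ has
matrix elements

$$\langle\alpha_r^a\alpha_s^b|V_{rs}|\alpha_r^c\alpha_s^d\rangle = \langle ab|V|cd\rangle \tag{31}$$

for brevity. Proceeding as before we get, corresponding to (25),

$$SV_T|\alpha_1^{x_1}\alpha_2^{x_2}\ldots\rangle = \sum_{r,\,s\neq r}\sum_{ab} S|\alpha^{x_1}\alpha^{x_2}\ldots\alpha^a\ldots\alpha^b\ldots\rangle\langle ab|V|x_r x_s\rangle \tag{32}$$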


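and corresponding to (26)

$$V_T\,\eta_{x_1}\eta_{x_2}\ldots|0\rangle = \sum_{abcd}\eta_a\eta_b\sum_{r,\,s\neq r}\eta_{x_1}\eta_{x_2}\ldots\eta_{x_r}^{-1}\eta_{x_s}^{-1}\ldots|0\rangle\,\delta_{cx_r}\delta_{dx_s}\langle ab|V|cd\rangle. \tag{33}$$

We can deduce as an extension of (27)

$$\bar\eta_c\bar\eta_d\,\eta_{x_1}\eta_{x_2}\ldots|0\rangle = \sum_{r,\,s\neq r}\eta_{x_1}\eta_{x_2}\ldots\eta_{x_r}^{-1}\eta_{x_s}^{-1}\ldots|0\rangle\,\delta_{cx_r}\delta_{dx_s} \tag{34}$$

so that (33) becomes

$$V_T\,\eta_{x_1}\eta_{x_2}\ldots|0\rangle = \sum_{abcd}\eta_a\eta_b\bar\eta_c\bar\eta_d\,\eta_{x_1}\eta_{x_2}\ldots|0\rangle\langle ab|V|cd\rangle,$$

giving us the operator equation

$$V_T = \sum_{abcd}\eta_a\eta_b\langle ab|V|cd\rangle\bar\eta_c\bar\eta_d. \tag{35}$$

The method can readily be extended to give any symmetrical function
of the boson variables in terms of the $\eta$’s and $\bar\eta$’s.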

The foregoing theory can easily be generalized to apply to an
assembly of bosons in interaction with some other dynamical system,
which we shall call for definiteness the atom. We must introduce a
set of basic kets, $|\zeta'\rangle$ say, for the atom alone. We can then get a set
of basic kets for the whole system of atom and bosons together by
multiplying each of the kets $|\zeta'\rangle$ into each of the kets (9). We may
write these kets

$$|\zeta'\rangle,\quad |\zeta'\alpha^a\rangle,\quad S|\zeta'\alpha^a\alpha^b\rangle,\quad S|\zeta'\alpha^a\alpha^b\alpha^c\rangle,\quad \ldots. \tag{36}$$

We may look upon the system as composed of the atom in interaction
with a set of oscillators, so that it can be described in terms of the
atom variables and the oscillator variables $\eta_a$, $\bar\eta_a$. Using again the
standard ket $|0\rangle$ for the set of oscillators, we have

$$S|\zeta'\alpha^a\alpha^b\alpha^c\ldots\rangle = \eta_a\eta_b\eta_c\ldots|0\rangle|\zeta'\rangle, \tag{37}$$

corresponding to (17), as the equation expressing the basic kets
(36) in terms of the oscillator variables.

Any function of the atom variables and boson variables which is
symmetrical between all the bosons is expressible as a function of the
atom variables and the $\eta$’s and $\bar\eta$’s. Consider first a function $U_T$ of
the form (22) with $U_r$ a function only of the atom variables and the
variables of the $r$th boson, so that it has a representative $\langle\zeta'\alpha_r^a|U_r|\zeta''\alpha_r^b\rangle$.
This representative must be independent of $r$ in order that $U_T$ may
be symmetrical between all the bosons, so we may write it
$\langle\zeta'\alpha^a|U|\zeta''\alpha^b\rangle$. Now let us define $\langle a|U|b\rangle$ to be that function of the
atom variables whose representative is $\langle\zeta'\alpha^a|U|\zeta''\alpha^b\rangle$, so that we have

$$\langle\zeta'\alpha_r^a|U_r|\zeta''\alpha_r^b\rangle = \langle\zeta'\alpha^a|U|\zeta''\alpha^b\rangle = \langle\zeta'|\langle a|U|b\rangle|\zeta''\rangle, \tag{38}$$

corresponding to (23). The equations (24)-(28) can now be taken over
and applied to the present work if both sides of all these equations
are multiplied by $|\zeta'\rangle$ on the right, with the result that formula (29)
still holds. We can deal similarly with a symmetrical function $V_T$ of
the form (30) with $V_{rs}$ a function only of the atom variables and the
variables of the $r$th and $s$th bosons. Defining $\langle ab|V|cd\rangle$ to be that
function of the atom variables whose representative is

$$\langle\zeta'\alpha_r^a\alpha_s^b|V_{rs}|\zeta''\alpha_r^c\alpha_s^d\rangle,$$

we find that formula (35) still holds.


61. Emission and absorption of bosons

Let us suppose that the oscillators of the preceding section are 
harmonic oscillators and there is no interaction between them. The 
energy of the $a$th oscillator is then, from (5) of § 34,

$$H_a = \hbar\omega_a\eta_a\bar\eta_a + \tfrac12\hbar\omega_a.$$

We shall neglect the constant term $\tfrac12\hbar\omega_a$, which is the energy of the
oscillator in its lowest state, the so-called ‘zero-point energy’. This
neglect does not have any dynamical consequences, as explained at
the beginning of § 30, and merely involves a redefinition of $H_a$. The
total energy of all the oscillators is now

$$H_T = \sum_a H_a = \sum_a \hbar\omega_a\eta_a\bar\eta_a = \sum_a \hbar\omega_a n_a \tag{39}$$

with the help of (12). This is of the same form as (10), with $\hbar\omega_a$ for
$H^a$. Thus a set of harmonic oscillators is equivalent to an assembly of
bosons in stationary states with no interaction between them. If an
oscillator of the set is in its $n'$th quantum state, there are $n'$ bosons in
the associated boson state.
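
The correspondence between oscillator quantum numbers and boson numbers can be illustrated numerically. The following is a minimal sketch, assuming a single oscillator truncated to a finite number of levels; the function names `ladder_operators` and `oscillator_energies` and the truncation size are illustrative choices, not part of the text. It builds matrices for $\eta$ and $\bar\eta$ in the basis of states with $0, 1, 2, \ldots$ bosons and checks that $\hbar\omega\,\eta\bar\eta$ is diagonal with eigenvalues $n'\hbar\omega$.

```python
import numpy as np

def ladder_operators(dim):
    """Return (eta, eta_bar) as dim x dim matrices in the basis |0>, ..., |dim-1>.

    eta raises the excitation number, eta|n> = sqrt(n+1)|n+1>, and eta_bar
    is its conjugate, following the convention used in the text.
    """
    eta = np.diag(np.sqrt(np.arange(1, dim)), k=-1)  # entries on the sub-diagonal
    eta_bar = eta.conj().T
    return eta, eta_bar

def oscillator_energies(hbar_omega, dim):
    """Diagonal of hbar*omega * eta * eta_bar, the energy with zero-point term dropped."""
    eta, eta_bar = ladder_operators(dim)
    number = eta @ eta_bar            # the boson-number operator n = eta eta_bar
    return hbar_omega * np.diag(number)

energies = oscillator_energies(hbar_omega=2.0, dim=5)
```

The commutation relation $\bar\eta\eta - \eta\bar\eta = 1$ holds for these matrices everywhere except in the last diagonal entry, which is an artefact of the truncation.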
In general the Hamiltonian for the set of oscillators will be a power
series in the variables $\eta_a$, $\bar\eta_a$, say

$$H_T = H_P + \sum_a (U_a\eta_a + \bar U_a\bar\eta_a) + \sum_{ab} (U_{ab}\eta_a\bar\eta_b + V_{ab}\eta_a\eta_b + \bar V_{ab}\bar\eta_a\bar\eta_b) + \ldots, \tag{40}$$

where $H_P$, $U_a$, $U_{ab}$, $V_{ab}$ are numbers, $H_P$ being real and $U_{ab} = \bar U_{ba}$. If
the set of oscillators are in interaction with an atom, as we had at
the end of the preceding section, the total Hamiltonian will still be
of the form (40), with $H_P$, $U_a$, $U_{ab}$, $V_{ab}$ functions of the atom variables,
$H_P$ in particular being the Hamiltonian for the atom by itself. A
general treatment of this dynamical system would be rather complicated
and for practical applications one assumes that the terms

$$H_P + \sum_a U_{aa}\eta_a\bar\eta_a \tag{41}$$

are large compared with the others and form by themselves an
unperturbed system, the remaining terms being taken into account
as a perturbation producing transitions in the unperturbed system,
according to the theory of § 44. If, further, $U_{aa}$ is independent of the
atom variables, the unperturbed system with Hamiltonian (41) consists
merely of an atom with Hamiltonian $H_P$ and an assembly of
bosons in stationary states with Hamiltonian of the form (39), with
no interaction.
Let us consider what kinds of transitions are produced by the
various perturbation terms in (40). Take a stationary state of the
unperturbed system for which the atom is in a stationary state, $\zeta'$ say,
and bosons are present in the stationary boson states $a, b, c, \ldots$. This
stationary state for the unperturbed system corresponds to the ket

$$\eta_a\eta_b\eta_c\ldots|0\rangle|\zeta'\rangle, \tag{42}$$

like (37). If the term $U_x\eta_x$ of (40) is multiplied into this ket, the
result is a linear combination of kets like

$$\eta_x\eta_a\eta_b\eta_c\ldots|0\rangle|\zeta''\rangle, \tag{43}$$

$\zeta''$ denoting any stationary state of the atom. The ket (43) refers to
one more boson than the ket (42), the extra boson being in the state $x$.
Thus the perturbation term $U_x\eta_x$ gives rise to transitions in which
one boson is emitted into state $x$ and the atom makes an arbitrary
jump. If the term $\bar U_x\bar\eta_x$ of (40) is multiplied into (42), the result is
zero unless (42) contains a factor $\eta_x$ and is then a linear combination
of kets like

$$\eta_x^{-1}\eta_a\eta_b\eta_c\ldots|0\rangle|\zeta''\rangle,$$

referring to one boson less in state $x$. Thus the perturbation term
$\bar U_x\bar\eta_x$ gives rise to transitions in which one boson is absorbed from
state $x$, the atom again making an arbitrary jump. Similarly, we find
that a perturbation term $U_{xy}\eta_x\bar\eta_y$ ($x \neq y$) gives rise to processes in
which a boson is absorbed from state $y$ and one is emitted into state
$x$, or, what is the same thing physically, one boson makes a transition
from state $y$ to state $x$. This kind of process would be produced by
a term like the $U_T$ of (22) and (29) in the perturbation energy, provided
the diagonal elements $\langle a|U|a\rangle$ vanish. Again, the perturbation
terms $V_{xy}\eta_x\eta_y$, $\bar V_{xy}\bar\eta_x\bar\eta_y$ give rise to processes in which two bosons are
emitted or absorbed, and so on for more complicated terms. With
any of these emission and absorption processes the atom can make
an arbitrary jump.

Let us determine how the probability of occurrence of each of these
transition processes depends on the numbers of bosons originally
present in the various boson states. From §§ 44, 46 the transition
probability is always proportional to the square of the modulus of
the matrix element of the perturbation energy referring to the two
states concerned. Thus the probability of a boson being emitted into
state $x$ with the atom making a jump from state $\zeta'$ to state $\zeta''$ is
proportional to

$$|\langle\zeta''|\langle n_1' n_2'\ldots(n_x'+1)\ldots|U_x\eta_x|n_1' n_2'\ldots n_x'\ldots\rangle|\zeta'\rangle|^2, \tag{44}$$

the $n'$’s being the numbers of bosons initially present in the various
boson states. Now from (6) and (17), with reference to (4),

$$|n_1' n_2' n_3'\ldots\rangle = (n_1'!\,n_2'!\,n_3'!\ldots)^{-1/2}\,\eta_1^{n_1'}\eta_2^{n_2'}\eta_3^{n_3'}\ldots|0\rangle, \tag{45}$$

so that

$$\eta_x|n_1' n_2'\ldots n_x'\ldots\rangle = (n_x'+1)^{1/2}\,|n_1' n_2'\ldots(n_x'+1)\ldots\rangle. \tag{46}$$

Hence (44) is equal to

$$(n_x'+1)\,|\langle\zeta''|U_x|\zeta'\rangle|^2, \tag{47}$$

showing that the probability of a transition in which a boson is emitted
into state $x$ is proportional to the number of bosons originally in state $x$
plus one.

The probability of a boson being absorbed from state $x$ with the
atom making a jump from state $\zeta'$ to state $\zeta''$ is proportional to

$$|\langle\zeta''|\langle n_1' n_2'\ldots(n_x'-1)\ldots|\bar U_x\bar\eta_x|n_1' n_2'\ldots n_x'\ldots\rangle|\zeta'\rangle|^2, \tag{48}$$

the $n'$’s again being the numbers of bosons initially present in the
various boson states. Now from (45)

$$\bar\eta_x|n_1' n_2'\ldots n_x'\ldots\rangle = n_x'^{1/2}\,|n_1' n_2'\ldots(n_x'-1)\ldots\rangle, \tag{49}$$

so (48) is equal to

$$n_x'\,|\langle\zeta''|\bar U_x|\zeta'\rangle|^2. \tag{50}$$

Thus the probability of a transition in which a boson is absorbed from
state $x$ is proportional to the number of bosons originally in state $x$.
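
The factors $n_x'+1$ for emission and $n_x'$ for absorption can be checked with small matrices. The sketch below assumes a single boson state truncated to `dim` levels; the helper names (`number_ket`, `raising_operator`, `emission_weight`, `absorption_weight`) are hypothetical and serve only this illustration. It evaluates the squared moduli of the matrix elements of $\eta$ and $\bar\eta$ between neighbouring number states, which is the part of (44) and (48) that depends on the boson numbers.

```python
import numpy as np

def number_ket(n, dim):
    """Column vector representing the state with exactly n bosons."""
    ket = np.zeros(dim)
    ket[n] = 1.0
    return ket

def raising_operator(dim):
    """Matrix of eta, with eta|n> = sqrt(n+1)|n+1>, truncated to dim levels."""
    return np.diag(np.sqrt(np.arange(1, dim)), k=-1)

def emission_weight(n, dim=20):
    """|<n+1| eta |n>|^2, the boson-number factor in the emission probability."""
    eta = raising_operator(dim)
    amplitude = number_ket(n + 1, dim) @ eta @ number_ket(n, dim)
    return abs(amplitude) ** 2

def absorption_weight(n, dim=20):
    """|<n-1| eta_bar |n>|^2, the boson-number factor in the absorption probability."""
    if n == 0:
        return 0.0                      # nothing to absorb from an empty state
    eta_bar = raising_operator(dim).T
    amplitude = number_ket(n - 1, dim) @ eta_bar @ number_ket(n, dim)
    return abs(amplitude) ** 2
```

For any `n` below the truncation limit, `emission_weight(n)` agrees with `n + 1` and `absorption_weight(n)` with `n` to rounding accuracy; in particular emission into an empty state still has weight one.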
Similar methods may be applied to more complicated processes,
and show that the probability of a process in which a boson makes
a transition from state $y$ to state $x$ ($x \neq y$) is proportional to $n_y'(n_x'+1)$.
More generally, the probability of a process in which bosons are
absorbed from states $x, y, \ldots$ and emitted into states $a, b, \ldots$ is
proportional to

$$n_x' n_y'\ldots(n_a'+1)(n_b'+1)\ldots, \tag{51}$$

the $n'$’s being in each case the numbers of bosons originally present.
These results hold both for direct transition processes and transition
processes that take place through one or more intermediate states,
in accordance with the interpretation given at the end of § 44.
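
The general factor (51) is simple enough to tabulate directly. The following is a small sketch, assuming that every state label occurs at most once among the absorbed and emitted states (the case in which (51) is written as a plain product); the function name `statistical_factor` and the dictionary layout are illustrative assumptions.

```python
def statistical_factor(occupations, absorbed=(), emitted=()):
    """Return the product n_x n_y ... (n_a + 1)(n_b + 1) ... of formula (51).

    occupations: dict mapping a boson-state label to the number of bosons
                 originally present in it (missing labels count as 0).
    absorbed, emitted: iterables of state labels.
    Assumption of this sketch: every label appears at most once overall.
    """
    labels = list(absorbed) + list(emitted)
    if len(labels) != len(set(labels)):
        raise ValueError("each state may appear only once in this sketch")
    factor = 1
    for state in absorbed:
        factor *= occupations.get(state, 0)        # one factor n for each absorption
    for state in emitted:
        factor *= occupations.get(state, 0) + 1    # one factor n + 1 for each emission
    return factor
```

A transition of one boson from state `y` to state `x` is obtained with `absorbed=["y"]` and `emitted=["x"]`.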


62. Application to photons 

Since photons are bosons, the foregoing theory can be applied to
them. A photon is in a stationary state when it is in an eigenstate
of momentum. It then has two independent states of polarization,
which may be taken to be two perpendicular states of linear polarization.
The dynamical variables needed to describe the stationary
states are then the momentum $\mathbf{p}$, a vector, and a polarization variable
$\mathbf{l}$, consisting of a unit vector perpendicular to $\mathbf{p}$. The variables $\mathbf{p}$ and
$\mathbf{l}$ take the place of our previous $\alpha$’s. The eigenvalues of $\mathbf{p}$ consist of
all numbers from $-\infty$ to $\infty$ for each of the three Cartesian components
of $\mathbf{p}$, while for each eigenvalue $\mathbf{p}'$ of $\mathbf{p}$, $\mathbf{l}$ has just two
eigenvalues, namely two arbitrarily chosen vectors perpendicular
to $\mathbf{p}'$ and to one another. Owing to the eigenvalues of $\mathbf{p}$ forming
a continuous range, there are a continuous range of stationary
states, giving us the continuous basic kets $|\mathbf{p}'\mathbf{l}'\rangle$. However, the foregoing
theory was built up in terms of discrete basic kets $|\alpha'\rangle$ for a
boson. There are two formalisms which one may use for getting over
this discrepancy.

The first consists in replacing the continuous three-dimensional
distribution of eigenvalues for $\mathbf{p}$ by a large number of discrete points
lying very close together, forming a dust spread over the whole three-dimensional
$\mathbf{p}$-space. Let $s_{\mathbf{p}'}$ be the density of the dust (the number
of points per unit volume) in the neighbourhood of any point $\mathbf{p}'$.
Then $s_{\mathbf{p}'}$ must be large and positive, but is otherwise an arbitrary
function of $\mathbf{p}'$. An integral over the $\mathbf{p}$-space may be replaced by a
sum over the dust of points, in accordance with the formula

$$\iiint f(\mathbf{p}')\,dp_x'\,dp_y'\,dp_z' = \sum_{\mathbf{p}'} f(\mathbf{p}')\,s_{\mathbf{p}'}^{-1}, \tag{52}$$

which formula provides the basis of the passage from continuous $\mathbf{p}'$
values to discrete ones and vice versa. Any problem can be worked
out in terms of the discrete $\mathbf{p}'$ values, for which the theory of §§ 59-61
can be used, and the results can be transformed back to refer to continuous
$\mathbf{p}'$ values. The arbitrary density $s_{\mathbf{p}'}$ should then disappear
from the results.
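
Formula (52) can be tried out numerically. The sketch below is a one-dimensional analogue only (the text works in three dimensions), and the particular dust, placed at $p = \sinh u$ for equally spaced $u$, is a hypothetical choice made so that the density varies strongly from place to place. The sum of $f/s$ over the dust reproduces the integral of $f$, with the arbitrary density dropping out.

```python
import math

def dust_sum(f, u_max=6.0, num_points=4001):
    """Approximate the integral of f(p) over all p by a sum over a non-uniform dust.

    The dust points are p = sinh(u) for equally spaced u, so their density
    (points per unit p) is 1 / (cosh(u) * du): dense near p = 0, sparse far out.
    """
    du = 2.0 * u_max / (num_points - 1)
    total = 0.0
    for i in range(num_points):
        u = -u_max + i * du
        p = math.sinh(u)
        density = 1.0 / (math.cosh(u) * du)   # s_p in the notation of (52)
        total += f(p) / density               # each point contributes f(p) / s_p
    return total

gaussian_integral = dust_sum(lambda p: math.exp(-p * p))
```

With these settings the result agrees with $\sqrt{\pi}$, the exact value of the Gaussian integral, to well within $10^{-6}$.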

The second formalism consists in modifying the equations of the
theory of §§ 59-61 so as to make them apply to the case of a continuous
range of basic kets $|\alpha'\rangle$, by replacing sums by integrals and
replacing the $\delta$ symbol in the commutation relations (11) by $\delta$ functions,
so far as concerns the variables with continuous eigenvalues.
Each of these formalisms has some advantages and some disadvan- 
tages. The first is usually more convenient for physical discussion, 
the second for mathematical development. Both will be developed 
here and one or other will be used according to which is more suitable 
at the moment. 

The Hamiltonian describing an assembly of photons interacting
with an atom will be of the general form (40), with the coefficients
$H_P$, $U_a$, $U_{ab}$, $V_{ab}$ involving the atom variables. This Hamiltonian may
be written

$$H_T = H_P + H_R + H_Q, \tag{53}$$

where $H_P$ is the energy of the atom alone, $H_R$ is the energy of the
assembly of photons alone,

$$H_R = \sum_{\mathbf{p}'\mathbf{l}'} n_{\mathbf{p}'\mathbf{l}'}\,h\nu_{\mathbf{p}'}, \tag{54}$$

$\nu_{\mathbf{p}'}$ being the frequency of a photon of momentum $\mathbf{p}'$, and $H_Q$ is the
interaction energy, which can be evaluated from analogy with the
classical theory, as will be shown in the next section. The whole
system can be treated by a perturbation method as discussed in the
preceding section, $H_P$ and $H_R$ providing the energy (41) of the
unperturbed system and $H_Q$ being the perturbation energy, which
gives rise to transition processes in which photons are emitted and
absorbed and the atom jumps from one stationary state to another.

We saw in the preceding section that the probability of an absorp- 
tion process is proportional to the number of bosons originally in the 
state from which a boson is absorbed. From this we can infer that 
the probability of a photon being absorbed from a beam of radiation 
incident on an atom is proportional to the intensity of the beam. 
We also saw that the probability of an emission process is propor- 
tional to the number of bosons originally in the state concerned plus 
one. To interpret this result we must make a careful study of the 
relations involved in replacing the continuous range of photon states 
by a discrete set. 

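Let us neglect for the present the polarization variable $\mathbf{l}$. Let
$|\mathbf{p}'D\rangle$ be the normalized ket corresponding to the discrete photon
state $\mathbf{p}'$. Then from (22) of § 16

$$\sum_{\mathbf{p}'} |\mathbf{p}'D\rangle\langle\mathbf{p}'D| = 1,$$

which gives from (52)

$$\int |\mathbf{p}'D\rangle\langle\mathbf{p}'D|\,s_{\mathbf{p}'}\,d^3p' = 1, \tag{55}$$

$d^3p'$ being written for $dp_x'\,dp_y'\,dp_z'$, for brevity. Now if $|\mathbf{p}'\rangle$ is the basic
ket corresponding to the continuous state $\mathbf{p}'$, we have according to
(24) of § 16

$$\int |\mathbf{p}'\rangle\langle\mathbf{p}'|\,d^3p' = 1,$$

which shows, on comparison with (55), that

$$|\mathbf{p}'\rangle = |\mathbf{p}'D\rangle\,s_{\mathbf{p}'}^{1/2}. \tag{56}$$

The connexion between $|\mathbf{p}'\rangle$ and $|\mathbf{p}'D\rangle$ is like the connexion between
the basic kets when one changes the weight function of the representation,
as shown by (38) of § 16.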


With $n'_{\mathbf{p}'}$ photons in each discrete photon state $\mathbf{p}'$, the Gibbs
density $\rho$ for the assembly of photons is, according to (68) of § 33,

$$\rho = \sum_{\mathbf{p}'} |\mathbf{p}'D\rangle\,n'_{\mathbf{p}'}\,\langle\mathbf{p}'D| = \int |\mathbf{p}'D\rangle\,n'_{\mathbf{p}'}\,\langle\mathbf{p}'D|\,s_{\mathbf{p}'}\,d^3p' = \int |\mathbf{p}'\rangle\,n'_{\mathbf{p}'}\,\langle\mathbf{p}'|\,d^3p' \tag{57}$$

with the help of (56). The number of photons per unit volume in the
neighbourhood of any point $\mathbf{x}'$ is then $\langle\mathbf{x}'|\rho|\mathbf{x}'\rangle$, according to (73)
of § 33. From (57) this equals

$$\langle\mathbf{x}'|\rho|\mathbf{x}'\rangle = \int \langle\mathbf{x}'|\mathbf{p}'\rangle\,n'_{\mathbf{p}'}\,\langle\mathbf{p}'|\mathbf{x}'\rangle\,d^3p' = \int h^{-3}\,n'_{\mathbf{p}'}\,d^3p' \tag{58}$$

if one puts in the value of the transformation function $\langle\mathbf{x}'|\mathbf{p}'\rangle$ given
by (54) of § 23. Equation (58) expresses the number of photons per
unit volume as an integral over the momentum space, so the integrand
in (58) can be interpreted as the number of photons per unit
of phase space. We obtain in this way the result that the number of
photons per unit of phase space is equal to $h^{-3}$ times the number of
photons per discrete state, in other words, a cell of volume $h^3$ in phase
space is equivalent to a discrete state. This result is a general one,
holding for any kind of particle. If the polarization variable of the
photons is not neglected, the result holds for each of the two independent
states of polarization.
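
The step from the first to the second form of (58) can be spelled out. Assuming the three-dimensional transformation function has the usual plane-wave form (quoted here from memory of (54) of § 23 rather than reproduced from this section),

$$\langle\mathbf{x}'|\mathbf{p}'\rangle = h^{-3/2}\,e^{i(\mathbf{p}',\mathbf{x}')/\hbar},$$

its product with the conjugate imaginary is independent of both $\mathbf{x}'$ and $\mathbf{p}'$:

$$\langle\mathbf{x}'|\mathbf{p}'\rangle\langle\mathbf{p}'|\mathbf{x}'\rangle = |\langle\mathbf{x}'|\mathbf{p}'\rangle|^2 = h^{-3}.$$

This is the origin of the factor $h^{-3}$ in the integrand of (58), and hence of the cell volume $h^3$.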

The momentum of a photon of frequency $\nu$ is of magnitude $h\nu/c$,
so the element of momentum space

$$dp_x\,dp_y\,dp_z = h^3c^{-3}\nu^2\,d\nu\,d\omega,$$

$d\omega$ being an element of solid angle for the direction of the vector $\mathbf{p}$.
Thus a distribution of photons with $n'_{\mathbf{p}}$ per discrete state, which is
equivalent to a distribution of $h^{-3}n'_{\mathbf{p}}\,d^3p\,d^3x$ photons in an element
of volume $d^3x$ and an element of momentum space $d^3p$, equals a
distribution of $n'_{\mathbf{p}}\,c^{-3}\nu^2\,d\nu\,d\omega\,d^3x$ photons in an element of volume $d^3x$
and a frequency range $d\nu$ and a direction of motion $d\omega$. This corresponds
to an energy density $n'_{\mathbf{p}}\,hc^{-3}\nu^3$ per unit solid angle per unit
frequency range, or an intensity per unit frequency range (i.e. an
energy crossing unit area per unit time per unit frequency range) of

$$I_\nu = n'_{\mathbf{p}}\,h\nu^3/c^2. \tag{59}$$

The result that the probability of a photon being emitted is proportional
to $n'_{\mathbf{p}\mathbf{l}}+1$, $n'_{\mathbf{p}\mathbf{l}}$ being the number of photons initially present
in the discrete state concerned, can now be interpreted as the probability
being proportional to $I_{\nu\mathbf{l}}+h\nu^3/c^2$, where $I_{\nu\mathbf{l}}$ is the intensity of
the incident radiation per unit frequency range in the neighbourhood
of the frequency of the emitted photon and having the same polarization
$\mathbf{l}$ as the emitted photon. Thus with no incident radiation there
is still a certain amount of emission, but the emission is increased or
stimulated by incident radiation in the same direction and having the
same frequency and polarization as the emitted radiation. The
present theory of radiation thus completes the imperfect one of § 45
by giving both stimulated and spontaneous emission. The ratio it
gives for the two kinds of emission, namely $I_\nu : h\nu^3/c^2$, is in agreement
with that provided by Einstein’s theory of statistical equilibrium
mentioned in § 45.
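
The conversion (59) between photons per discrete state and intensity, and the resulting ratio of stimulated to spontaneous emission, can be put into a few lines. This is a minimal sketch in CGS units; the function names are hypothetical, and the intensity is understood per unit solid angle, per unit frequency range and for one polarization, as in the text.

```python
H_PLANCK = 6.62607015e-27   # Planck's constant h in erg s
C_LIGHT = 2.99792458e10     # velocity of light in cm / s

def intensity_from_occupation(n, nu):
    """Intensity I_nu = n h nu^3 / c^2 of formula (59) for n photons per discrete state."""
    return n * H_PLANCK * nu ** 3 / C_LIGHT ** 2

def occupation_from_intensity(intensity, nu):
    """Invert (59): the number of photons per discrete state for a given intensity."""
    return intensity * C_LIGHT ** 2 / (H_PLANCK * nu ** 3)

def stimulated_to_spontaneous(intensity, nu):
    """Ratio I_nu : h nu^3 / c^2 of stimulated to spontaneous emission.

    By (59) this ratio is simply the number of photons per discrete state.
    """
    return occupation_from_intensity(intensity, nu)
```

The last function makes explicit that stimulated emission outweighs spontaneous emission only when there is more than one photon per discrete state in the incident radiation.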

The probability of a photon being scattered from the state $\mathbf{p}'\mathbf{l}'$ to
the state $\mathbf{p}''\mathbf{l}''$ is proportional to $n'_{\mathbf{p}'\mathbf{l}'}(n'_{\mathbf{p}''\mathbf{l}''}+1)$, the $n'$’s being the
numbers of photons initially in the discrete states concerned. We can
interpret this result as the probability being proportional to

$$I_{\nu'\mathbf{l}'}\,(I_{\nu''\mathbf{l}''}+h\nu''^3/c^2). \tag{60}$$

Similarly for a more general radiative process in which several
photons are emitted and absorbed, the probability is proportional
to a factor $I_{\nu\mathbf{l}}$ for each absorbed photon and a factor $I_{\nu\mathbf{l}}+h\nu^3/c^2$ for
each emitted photon. Thus the process is stimulated by incident
radiation in the same direction and with the same frequency and
polarization as any of the emitted photons.


63. The interaction energy between photons and an atom 

We shall now determine the interaction energy between an atom
and an assembly of photons, i.e. the $H_Q$ of equation (53), from
analogy with the classical expression for the interaction energy
between an atom and a field of radiation. For simplicity we shall
suppose the atom to consist of a single electron moving in an electrostatic
field of force. The field of radiation may be described by a
scalar and a vector potential. These potentials are to a certain extent
arbitrary and may be chosen so that the scalar potential vanishes.
The field is then completely described by the vector potential $A_x$, $A_y$,
$A_z$, or $\mathbf{A}$. The change that the field causes in the Hamiltonian
describing the atom is now, as explained at the beginning of § 41,

$$H_Q = \frac{1}{2m}\Big\{\Big(\mathbf{p}+\frac{e}{c}\mathbf{A}\Big)^2-\mathbf{p}^2\Big\} = \frac{e}{mc}(\mathbf{p},\mathbf{A})+\frac{e^2}{2mc^2}(\mathbf{A},\mathbf{A}). \tag{61}$$

This is the classical interaction energy. The $\mathbf{A}$ that occurs here should
be the value of the vector potential at the point where the electron is
momentarily situated. It is, however, a good enough approximation
if we take this $\mathbf{A}$ to be the vector potential at some fixed point in the
atom, such as the nucleus, provided we are dealing with radiation
whose wavelength is large compared with the dimensions of the atom.

Let us first consider the field of radiation classically and ignore its
interaction with the atom. The vector potential $\mathbf{A}$ satisfies, according
to Maxwell’s theory, the equations

$$\Box\mathbf{A}=0,\qquad \operatorname{div}\mathbf{A}=0, \tag{62}$$

$\Box$ being short for $\partial^2/c^2\partial t^2-\partial^2/\partial x^2-\partial^2/\partial y^2-\partial^2/\partial z^2$. The first of these
equations shows that $\mathbf{A}$ can be resolved into Fourier components in
the form

$$\mathbf{A}=\int\{\mathbf{A}_{\mathbf{k}}\,e^{-i(\mathbf{k}\mathbf{x})+2\pi i\nu_k t}+\bar{\mathbf{A}}_{\mathbf{k}}\,e^{i(\mathbf{k}\mathbf{x})-2\pi i\nu_k t}\}\,d^3k, \tag{63}$$

each Fourier component representing a train of waves moving with
the velocity of light, described by a vector $\mathbf{k}$ whose direction gives
the direction of motion of the waves and whose magnitude $|\mathbf{k}|$ is
connected with their frequency $\nu_k$ by

$$2\pi\nu_k = c|\mathbf{k}|. \tag{64}$$

The vector $\mathbf{k}$ is just the momentum of a photon which the quantum
theory would associate with these waves, divided by $\hbar$. For each
value of $\mathbf{k}$ we have an amplitude $\mathbf{A}_{\mathbf{k}}$, which is in general a complex
vector, and the integral in (63) extends over the whole of the three-dimensional
$\mathbf{k}$-space. The second of equations (62) gives

$$(\mathbf{k},\mathbf{A}_{\mathbf{k}}) = 0, \tag{65}$$

showing that, for each value of $\mathbf{k}$, $\mathbf{A}_{\mathbf{k}}$ is perpendicular to $\mathbf{k}$. This
expresses that the waves are transverse waves. $\mathbf{A}_{\mathbf{k}}$ is determined by
its two components in two directions perpendicular to each other and
to $\mathbf{k}$, these two components corresponding to two independent states
of linear polarization.


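The total energy of the radiation is given by the volume integral

$$H_R = (8\pi)^{-1}\int(\boldsymbol{\mathcal{E}}^2+\boldsymbol{\mathcal{H}}^2)\,d^3x \tag{66}$$

taken over the whole of space, where the electric field $\boldsymbol{\mathcal{E}}$ and the
magnetic field $\boldsymbol{\mathcal{H}}$ of the radiation are given by

$$\boldsymbol{\mathcal{E}} = -\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t},\qquad \boldsymbol{\mathcal{H}} = \operatorname{curl}\mathbf{A}. \tag{67}$$

Using standard formulas of vector analysis, we have

$$\operatorname{div}[\mathbf{A}\times\boldsymbol{\mathcal{H}}] = (\boldsymbol{\mathcal{H}},\operatorname{curl}\mathbf{A})-(\mathbf{A},\operatorname{curl}\boldsymbol{\mathcal{H}}) = \boldsymbol{\mathcal{H}}^2-(\mathbf{A},\operatorname{curl}\operatorname{curl}\mathbf{A}) = \boldsymbol{\mathcal{H}}^2+(\mathbf{A},\nabla^2\mathbf{A})$$

with the help of the second of equations (62). Thus (66) becomes,
with neglect of a term which can be transformed to a surface integral
at infinity,

$$H_R = (8\pi)^{-1}\int\Big\{\frac{1}{c^2}\Big(\frac{\partial\mathbf{A}}{\partial t},\frac{\partial\mathbf{A}}{\partial t}\Big)-(\mathbf{A},\nabla^2\mathbf{A})\Big\}\,d^3x. \tag{68}$$

By substituting for $\mathbf{A}$ here its value given by (63), we can get the
energy of the radiation in terms of the Fourier amplitudes $\mathbf{A}_{\mathbf{k}}$. The
energy of the radiation is constant (since we are now ignoring the
interaction of the radiation and the atom), so in this calculation we
may take $t = 0$. This means taking

$$\mathbf{A} = \int(\mathbf{A}_{\mathbf{k}}+\bar{\mathbf{A}}_{-\mathbf{k}})\,e^{-i(\mathbf{k}\mathbf{x})}\,d^3k, \tag{69}$$

$$\nabla^2\mathbf{A} = -\int k^2(\mathbf{A}_{\mathbf{k}}+\bar{\mathbf{A}}_{-\mathbf{k}})\,e^{-i(\mathbf{k}\mathbf{x})}\,d^3k,\qquad \partial\mathbf{A}/\partial t = ic\int|\mathbf{k}|(\mathbf{A}_{\mathbf{k}}-\bar{\mathbf{A}}_{-\mathbf{k}})\,e^{-i(\mathbf{k}\mathbf{x})}\,d^3k. \tag{70}$$

Inserting these expressions in (68), we get

$$H_R = (8\pi)^{-1}\iiint\{k^2(\mathbf{A}_{\mathbf{k}}+\bar{\mathbf{A}}_{-\mathbf{k}},\,\mathbf{A}_{\mathbf{k}'}+\bar{\mathbf{A}}_{-\mathbf{k}'})-|\mathbf{k}||\mathbf{k}'|(\mathbf{A}_{\mathbf{k}}-\bar{\mathbf{A}}_{-\mathbf{k}},\,\mathbf{A}_{\mathbf{k}'}-\bar{\mathbf{A}}_{-\mathbf{k}'})\}\,e^{-i(\mathbf{k}\mathbf{x})}e^{-i(\mathbf{k}'\mathbf{x})}\,d^3k\,d^3k'\,d^3x$$

$$= \pi^2\iint\{k^2(\mathbf{A}_{\mathbf{k}}+\bar{\mathbf{A}}_{-\mathbf{k}},\,\mathbf{A}_{\mathbf{k}'}+\bar{\mathbf{A}}_{-\mathbf{k}'})-|\mathbf{k}||\mathbf{k}'|(\mathbf{A}_{\mathbf{k}}-\bar{\mathbf{A}}_{-\mathbf{k}},\,\mathbf{A}_{\mathbf{k}'}-\bar{\mathbf{A}}_{-\mathbf{k}'})\}\,\delta(\mathbf{k}+\mathbf{k}')\,d^3k\,d^3k',$$

with the help of formula (49) of § 23, $\delta(\mathbf{k}+\mathbf{k}')$ being the product of
three factors, one for each component of $\mathbf{k}$. Hence

$$H_R = \pi^2\int k^2\{(\mathbf{A}_{\mathbf{k}}+\bar{\mathbf{A}}_{-\mathbf{k}},\,\mathbf{A}_{-\mathbf{k}}+\bar{\mathbf{A}}_{\mathbf{k}})-(\mathbf{A}_{\mathbf{k}}-\bar{\mathbf{A}}_{-\mathbf{k}},\,\mathbf{A}_{-\mathbf{k}}-\bar{\mathbf{A}}_{\mathbf{k}})\}\,d^3k$$

$$= 2\pi^2\int k^2\{(\mathbf{A}_{\mathbf{k}},\bar{\mathbf{A}}_{\mathbf{k}})+(\bar{\mathbf{A}}_{-\mathbf{k}},\mathbf{A}_{-\mathbf{k}})\}\,d^3k = 4\pi^2\int k^2(\mathbf{A}_{\mathbf{k}},\bar{\mathbf{A}}_{\mathbf{k}})\,d^3k. \tag{71}$$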


We can replace the continuous distribution of $\mathbf{k}$-values by a dust of
discrete $\mathbf{k}$-values, as we did with the $\mathbf{p}$-values in the preceding
section. The integral (71) then goes over, according to formula (52),
into the sum

$$H_R = 4\pi^2\sum_{\mathbf{k}} k^2(\mathbf{A}_{\mathbf{k}},\bar{\mathbf{A}}_{\mathbf{k}})\,s_{\mathbf{k}}^{-1},$$

$s_{\mathbf{k}}$ being the density of the discrete $\mathbf{k}$-values. We may also write
this as

$$H_R = 4\pi^2\sum_{\mathbf{k}\mathbf{l}} k^2 A_{\mathbf{k}\mathbf{l}}\bar A_{\mathbf{k}\mathbf{l}}\,s_{\mathbf{k}}^{-1}, \tag{72}$$

$A_{\mathbf{k}\mathbf{l}}$ being a component of $\mathbf{A}_{\mathbf{k}}$ in a direction $\mathbf{l}$ perpendicular to $\mathbf{k}$ and
the summation with respect to $\mathbf{l}$ referring to two directions $\mathbf{l}$ perpendicular
to each other. Thus there is one term in (72) for each independent
stationary state for a photon.

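The field quantities $\boldsymbol{\mathcal{E}}$ and $\boldsymbol{\mathcal{H}}$ at any point $\mathbf{x}$ can be looked upon
as dynamical variables. The quantities

$$\mathbf{A}_{\mathbf{k}t} = \mathbf{A}_{\mathbf{k}}\,e^{2\pi i\nu_k t},\qquad \bar{\mathbf{A}}_{\mathbf{k}t} = \bar{\mathbf{A}}_{\mathbf{k}}\,e^{-2\pi i\nu_k t}$$

are then dynamical variables at time $t$, since they are connected with
$\boldsymbol{\mathcal{E}}$ and $\boldsymbol{\mathcal{H}}$ at various points $\mathbf{x}$ at time $t$ by equations which do not
involve $t$, as follows from (63) and (67). $\mathbf{A}_{\mathbf{k}}$ is constant, so $\mathbf{A}_{\mathbf{k}t}$ varies
with $t$ according to the simple harmonic law. Thus $\mathbf{A}_{\mathbf{k}t}$ is like the $\eta_t$
of a harmonic oscillator, defined by (3) of § 34, the $\omega$ of the oscillator
being $2\pi\nu_k$. We may take each $A_{\mathbf{k}\mathbf{l}t}$ to be proportional to the $\eta_t$ of
some harmonic oscillator and then the field of radiation becomes a
set of harmonic oscillators.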

Let us now pass over to the quantum theory and take the $A_{\mathbf{k}\mathbf{l}t}$, $\bar A_{\mathbf{k}\mathbf{l}t}$
to be dynamical variables in the Heisenberg picture. The expression
(72) for the energy may be retained unchanged, the order in which
the factors $A_{\mathbf{k}\mathbf{l}}$, $\bar A_{\mathbf{k}\mathbf{l}}$ there occur being the correct one to give no zero-point
energy. The $A_{\mathbf{k}\mathbf{l}t}$ then still vary with time according to the $e^{i\omega t}$
law and may still be taken to be proportional to the $\eta$’s of harmonic
oscillators. The factor of proportionality may be obtained by equating
(72) to the expression (39) for the energy, with the label $a$ replaced
by the two labels $\mathbf{k}$ and $\mathbf{l}$ and with $h\nu_k$ for $\hbar\omega_a$. This gives

$$4\pi^2\sum_{\mathbf{k}\mathbf{l}} k^2 A_{\mathbf{k}\mathbf{l}t}\bar A_{\mathbf{k}\mathbf{l}t}\,s_{\mathbf{k}}^{-1} = \sum_{\mathbf{k}\mathbf{l}} h\nu_k\,\eta_{\mathbf{k}\mathbf{l}t}\bar\eta_{\mathbf{k}\mathbf{l}t},$$

the suffix $t$ being inserted to show that we are dealing with Heisenberg
dynamical variables (as we should when transferring equations of the
classical theory to the quantum theory). Hence, using (64),

$$4\pi^2 A_{\mathbf{k}\mathbf{l}t} = c\,h^{1/2}\nu_k^{-1/2}\,\eta_{\mathbf{k}\mathbf{l}t}\,s_{\mathbf{k}}^{1/2}, \tag{73}$$

with neglect of an unimportant arbitrary phase factor. In this way
the Heisenberg dynamical variables $\eta_{\mathbf{k}\mathbf{l}t}$, which describe the field of
radiation as a set of oscillators, are introduced. The commutation
relations between the $\eta_{\mathbf{k}\mathbf{l}t}$ and $\bar\eta_{\mathbf{k}\mathbf{l}t}$ are known, being given by (11), so
equation (73) fixes the commutation relations between the $A_{\mathbf{k}\mathbf{l}t}$ and
$\bar A_{\mathbf{k}\mathbf{l}t}$. It thus fixes the commutation relations between the potentials
$\mathbf{A}$ and the field quantities $\boldsymbol{\mathcal{E}}$ and $\boldsymbol{\mathcal{H}}$ at various points $\mathbf{x}$ at the time $t$.

(Incidentally, the commutation relations of the $\mathbf{A}_{\mathbf{k}}$, $\bar{\mathbf{A}}_{\mathbf{k}}$ are fixed,
so the commutation relation of two potential or field quantities at
two different times is also fixed.)
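
The factor of proportionality in (73) follows from a short calculation, sketched here under the assumption that the two sums are equated term by term and that the factor is taken real and positive. Equating the terms with given $\mathbf{k}$ and $\mathbf{l}$,

$$4\pi^2 k^2 A_{\mathbf{k}\mathbf{l}t}\bar A_{\mathbf{k}\mathbf{l}t}\,s_{\mathbf{k}}^{-1} = h\nu_k\,\eta_{\mathbf{k}\mathbf{l}t}\bar\eta_{\mathbf{k}\mathbf{l}t},$$

and substituting $k = 2\pi\nu_k/c$ from (64), one finds

$$A_{\mathbf{k}\mathbf{l}t}\bar A_{\mathbf{k}\mathbf{l}t} = \frac{h\nu_k\,c^2}{16\pi^4\nu_k^2}\,s_{\mathbf{k}}\,\eta_{\mathbf{k}\mathbf{l}t}\bar\eta_{\mathbf{k}\mathbf{l}t} = \Big(\frac{c\,h^{1/2}\nu_k^{-1/2}s_{\mathbf{k}}^{1/2}}{4\pi^2}\Big)^2\eta_{\mathbf{k}\mathbf{l}t}\bar\eta_{\mathbf{k}\mathbf{l}t}.$$

Taking $A_{\mathbf{k}\mathbf{l}t}$ proportional to $\eta_{\mathbf{k}\mathbf{l}t}$ then gives the coefficient $c\,h^{1/2}\nu_k^{-1/2}s_{\mathbf{k}}^{1/2}/4\pi^2$, up to the arbitrary phase factor mentioned in the text.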

We can still use (73) when the interaction between the field of 
radiation and the atom is taken into account. This involves assuming 
that the interaction does not affect the commutation relations 
between the potentials and field quantities at a given time. The 
interaction causes the $\eta_{\mathbf{k}\mathbf{l}t}$’s to cease to vary according to the simple
harmonic law and the oscillators to cease to be harmonic. Thus it 
may affect the commutation relation between two potential or field 
quantities at two different times. 

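We can now take over the interaction energy (61) into the quantum
theory, putting $\mathbf{p}_t$ for $\mathbf{p}$ to show it is a Heisenberg dynamical variable.
Taking the atomic nucleus to be at the origin we get, by substituting
(63) with $\mathbf{x} = 0$ into (61),

$$H_{Qt} = \frac{e}{mc}\int(\mathbf{p}_t,\,\mathbf{A}_{\mathbf{k}t}+\bar{\mathbf{A}}_{\mathbf{k}t})\,d^3k+\frac{e^2}{2mc^2}\iint(\mathbf{A}_{\mathbf{k}t}+\bar{\mathbf{A}}_{\mathbf{k}t},\,\mathbf{A}_{\mathbf{k}'t}+\bar{\mathbf{A}}_{\mathbf{k}'t})\,d^3k\,d^3k'$$

$$= \frac{e}{mc}\sum_{\mathbf{k}}(\mathbf{p}_t,\,\mathbf{A}_{\mathbf{k}t}+\bar{\mathbf{A}}_{\mathbf{k}t})\,s_{\mathbf{k}}^{-1}+\frac{e^2}{2mc^2}\sum_{\mathbf{k}\mathbf{k}'}(\mathbf{A}_{\mathbf{k}t}+\bar{\mathbf{A}}_{\mathbf{k}t},\,\mathbf{A}_{\mathbf{k}'t}+\bar{\mathbf{A}}_{\mathbf{k}'t})\,s_{\mathbf{k}}^{-1}s_{\mathbf{k}'}^{-1}$$

if we pass from continuous to discrete $\mathbf{k}$-values. Thus

$$H_{Qt} = \frac{e}{mc}\sum_{\mathbf{k}\mathbf{l}} p_{\mathbf{l}t}(A_{\mathbf{k}\mathbf{l}t}+\bar A_{\mathbf{k}\mathbf{l}t})\,s_{\mathbf{k}}^{-1}+\frac{e^2}{2mc^2}\sum_{\mathbf{k}\mathbf{l}\mathbf{k}'\mathbf{l}'}(A_{\mathbf{k}\mathbf{l}t}+\bar A_{\mathbf{k}\mathbf{l}t})(A_{\mathbf{k}'\mathbf{l}'t}+\bar A_{\mathbf{k}'\mathbf{l}'t})(\mathbf{l}\mathbf{l}')\,s_{\mathbf{k}}^{-1}s_{\mathbf{k}'}^{-1},$$

$p_{\mathbf{l}t}$ being the component of $\mathbf{p}_t$ in the direction $\mathbf{l}$. With the help of (73)
we may express $H_{Qt}$ in terms of the $\eta_{\mathbf{k}\mathbf{l}t}$ and $\bar\eta_{\mathbf{k}\mathbf{l}t}$, and we can then drop
the suffix $t$ (which means going over to Schrödinger dynamical
variables), so that we obtain finally

$$H_Q = \frac{e\,h^{1/2}}{4\pi^2 m}\sum_{\mathbf{k}\mathbf{l}} p_{\mathbf{l}}\,\nu_k^{-1/2}(\eta_{\mathbf{k}\mathbf{l}}+\bar\eta_{\mathbf{k}\mathbf{l}})\,s_{\mathbf{k}}^{-1/2}+\frac{e^2h}{32\pi^4m}\sum_{\mathbf{k}\mathbf{l}\mathbf{k}'\mathbf{l}'}\nu_k^{-1/2}\nu_{k'}^{-1/2}(\eta_{\mathbf{k}\mathbf{l}}+\bar\eta_{\mathbf{k}\mathbf{l}})(\eta_{\mathbf{k}'\mathbf{l}'}+\bar\eta_{\mathbf{k}'\mathbf{l}'})(\mathbf{l}\mathbf{l}')\,s_{\mathbf{k}}^{-1/2}s_{\mathbf{k}'}^{-1/2}. \tag{74}$$

With the model of the atom we are using, the interaction energy
appears as a linear plus a quadratic function in the $\eta$’s and $\bar\eta$’s. The
linear terms give rise to emission and absorption processes, the
quadratic ones to scattering processes and processes in which two
photons are absorbed or emitted simultaneously. The order of the
factors $\eta$ and $\bar\eta$ in the quadratic terms is not determined by the
procedure of working from the classical theory, but this order is
unimportant, since a change in it merely changes $H_Q$ by a constant.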

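The matrix element of $H_Q$ referring to the emission of a photon
into the discrete state $\mathbf{k}'\mathbf{l}$, or into the discrete state $\mathbf{p}'\mathbf{l}$, as it may also
be labelled, with the atom jumping from state $\alpha^0$ to state $\alpha'$, is

$$\langle\mathbf{p}'D\,\mathbf{l}\,\alpha'|H_Q|\alpha^0\rangle = \frac{e\,h^{1/2}}{4\pi^2m\,\nu'^{1/2}}\langle\alpha'|p_{\mathbf{l}}|\alpha^0\rangle\,s_{\mathbf{k}'}^{-1/2} = \frac{e}{mh(2\pi\nu')^{1/2}}\langle\alpha'|p_{\mathbf{l}}|\alpha^0\rangle\,s_{\mathbf{p}'}^{-1/2},$$

since $s_{\mathbf{k}} = s_{\mathbf{p}}\hbar^3$. The $p_{\mathbf{l}}$ occurring here, referring to the momentum
of the electron, is, of course, quite distinct from the other letters $\mathbf{p}$,
referring to the momentum of the emitted photon. To avoid confusion
we shall replace the electron momentum $\mathbf{p}$ by $m\dot{\mathbf{x}}$, these two
dynamical variables being the same for the unperturbed atom. Passing
over to continuous photon states by means of the conjugate
imaginary of equation (56), we get

$$\langle\mathbf{p}'\mathbf{l}\,\alpha'|H_Q|\alpha^0\rangle = \frac{e}{h(2\pi\nu')^{1/2}}\langle\alpha'|\dot x_{\mathbf{l}}|\alpha^0\rangle. \tag{75}$$

Similarly, the matrix element of $H_Q$ referring to the absorption of a
photon from the continuous state $\mathbf{p}^0\mathbf{l}$ with the atom jumping from
state $\alpha^0$ to state $\alpha'$ is

$$\langle\alpha'|H_Q|\mathbf{p}^0\mathbf{l}\,\alpha^0\rangle = \frac{e}{h(2\pi\nu^0)^{1/2}}\langle\alpha'|\dot x_{\mathbf{l}}|\alpha^0\rangle, \tag{76}$$

and the matrix element referring to the scattering of a photon from
the continuous state $\mathbf{p}^0\mathbf{l}^0$ to the continuous state $\mathbf{p}'\mathbf{l}'$ with the atom
jumping from state $\alpha^0$ to state $\alpha'$ is

$$\langle\mathbf{p}'\mathbf{l}'\alpha'|H_Q|\mathbf{p}^0\mathbf{l}^0\alpha^0\rangle = \frac{e^2}{2\pi h^2m\,\nu'^{1/2}\nu^{0\,1/2}}\,(\mathbf{l}'\mathbf{l}^0)\,\delta_{\alpha'\alpha^0}, \tag{77}$$

there being two terms in (74) which contribute to it. These matrix
elements will be used in the next section. The matrix elements
referring to the simultaneous absorption or emission of two photons
may be written down in the same way, but they lead to physical
effects too small to be of practical importance.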
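
The numerical coefficients in (75) and (77) can be checked by simple arithmetic, assuming only $s_{\mathbf{k}} = s_{\mathbf{p}}\hbar^3$ and $\hbar = h/2\pi$. For the linear term of (74), a factor $s_{\mathbf{k}}^{-1/2} = s_{\mathbf{p}}^{-1/2}\hbar^{-3/2}$ gives

$$\frac{e\,h^{1/2}}{4\pi^2m\,\nu'^{1/2}}\cdot\frac{(2\pi)^{3/2}}{h^{3/2}} = \frac{e}{mh\,(2\pi)^{1/2}\nu'^{1/2}} = \frac{e}{mh(2\pi\nu')^{1/2}},$$

the factor $s_{\mathbf{p}'}^{-1/2}$ being removed on passing to continuous states with (56). For the quadratic term, the two contributing terms double the coefficient $e^2h/32\pi^4m$, and the two factors $s_{\mathbf{k}}^{-1/2}$ supply $\hbar^{-3} = (2\pi)^3/h^3$:

$$\frac{e^2h}{16\pi^4m}\cdot\frac{8\pi^3}{h^3} = \frac{e^2}{2\pi h^2m}.$$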


64. Emission, absorption, and scattering of radiation 
We can now determine directly the coefficients of emission, absorption,
and scattering of radiation by substituting in the formulas of
Chapter VIII the values for the matrix elements given by (75), (76),
and (77).

For determining the emission probability we can use formula
(56) of § 53. This shows that for an atom in a state $\alpha^0$ the probability
per unit time per unit solid angle of its spontaneously emitting
a photon and dropping to a state $\alpha'$ of lower energy is

$$\frac{4\pi^2}{h}\,\frac{WP}{c^2}\,\frac{e^2}{h^2(2\pi\nu)}\,|\langle\alpha'|\dot x_{\mathbf{l}}|\alpha^0\rangle|^2. \tag{78}$$

Now the energy and momentum of a photon of frequency $\nu$ are

$$W = h\nu,\qquad P = h\nu/c.$$

Again, from the Heisenberg law (20) of § 29,

$$\langle\alpha'|\dot x_{\mathbf{l}}|\alpha^0\rangle = -2\pi i\,\nu(\alpha^0\alpha')\,\langle\alpha'|x_{\mathbf{l}}|\alpha^0\rangle,$$

$\nu(\alpha^0\alpha')$ being the frequency connected with transitions from state $\alpha^0$
to state $\alpha'$, which in the present case is just the frequency $\nu$ of the
emitted radiation. These results substituted in (78) make the emission
coefficient reduce to

$$\frac{(2\pi\nu)^3}{hc^3}\,|\langle\alpha'|e x_{\mathbf{l}}|\alpha^0\rangle|^2. \tag{79}$$

To obtain the rate of emission of energy per unit solid angle for a
specified polarization, we must multiply this by $h\nu$. This gives for
the total rate of emission of energy in all directions

$$\frac{4}{3}\,\frac{(2\pi\nu)^4}{c^3}\,|\langle\alpha'|e\mathbf{x}|\alpha^0\rangle|^2, \tag{80}$$

which is in agreement with expression (34) of § 45 and justifies Heisenberg’s
assumption for the interpretation of his matrix elements.

In the same way the absorption coefficient, given by formula
(59) of § 53, becomes for photons

$$\frac{4\pi^2h^2W}{c^2P}\,\frac{e^2}{h^2(2\pi\nu)}\,|\langle\alpha'|\dot x_{\mathbf{l}}|\alpha^0\rangle|^2 = \frac{8\pi^3\nu}{c}\,|\langle\alpha'|e x_{\mathbf{l}}|\alpha^0\rangle|^2.$$

This absorption coefficient refers to an incident beam of one photon
crossing unit area per unit time per unit energy range. If we take
one per unit frequency range instead of energy range, as is usual
when dealing with radiation, the absorption coefficient becomes

$$\frac{8\pi^3\nu}{hc}\,|\langle\alpha'|e x_{\mathbf{l}}|\alpha^0\rangle|^2.$$

This result is the same as (32) of § 45, if we substitute for the $E_\nu$
there the energy $h\nu$ of a single photon. Thus the elementary theory
of § 45, in which the radiation field is treated as an external perturbation,
gives the correct value for the absorption coefficient.
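
Expression (80) for the total rate of emission of energy lends itself to a numerical illustration. The sketch below uses CGS units and, as an example that is not taken from the text, the decay of the $2p$ ($m = 0$) state of hydrogen to the ground state. The transition energy of about 10.2 eV and the dipole matrix element $(128\sqrt2/243)\,a_0$ are standard hydrogen values supplied here as outside assumptions. Dividing the energy rate (80) by $h\nu$ gives the number of photons emitted per unit time.

```python
import math

# CGS (Gaussian) constants.
H_PLANCK = 6.62607015e-27      # erg s
C_LIGHT = 2.99792458e10        # cm / s
E_CHARGE = 4.80320471e-10      # esu
BOHR_RADIUS = 5.29177211e-9    # cm
ERG_PER_EV = 1.602176634e-12   # erg per electron volt

def energy_emission_rate(nu, dipole_sq):
    """Total rate of emission of energy, formula (80): (4/3)(2 pi nu)^4 / c^3 |<a'|e x|a0>|^2."""
    return (4.0 / 3.0) * (2.0 * math.pi * nu) ** 4 / C_LIGHT ** 3 * dipole_sq

def photon_emission_rate(nu, dipole_sq):
    """Photons emitted per second: the energy rate divided by the photon energy h nu."""
    return energy_emission_rate(nu, dipole_sq) / (H_PLANCK * nu)

# Assumed example data (not from the text): hydrogen 2p (m = 0) -> 1s.
transition_energy_ev = 10.2
nu = transition_energy_ev * ERG_PER_EV / H_PLANCK
z_element = (128.0 * math.sqrt(2.0) / 243.0) * BOHR_RADIUS   # <1s| z |2p, m=0>
dipole_sq = (E_CHARGE * z_element) ** 2                       # only the z component is non-zero

rate = photon_emission_rate(nu, dipole_sq)
```

With these inputs the rate comes out at roughly $6\times10^8$ photons per second, i.e. a lifetime of the order of a nanosecond for the excited state.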

This agreement between the elementary theory and the present 
theory could be inferred from general arguments. The two theories 
differ only in that the field quantities all commute with one another 
in the elementary theory and satisfy definite commutation relations 
in the present theory, and this difference becomes unimportant for 
strong fields. Thus the two theories must give the same absorption 
and emission when strong fields are concerned. Since both theories 
give the rate of absorption proportional to the intensity of the inci- 
dent beam, the agreement must hold also for weak fields in the case of 
absorption. In the same way the stimulated part of the emission in the 
present theory must agree with the emission in the elementary theory. 

Let us now consider scattering. The direct scattering coefficient is
given by formula (38) of § 50. Such scattering of photons will not be
accompanied by any change of state of the atom on account of the
factor $\delta_{\alpha'\alpha^0}$ in the expression for the matrix element (77). Thus the
final energy $W'$ of the photon will equal its initial energy $W^0$. The
scattering coefficient now reduces to

$$e^4/m^2c^4\,.\,(\mathbf{l}'\mathbf{l}^0)^2.$$

This is the same as that given by classical mechanics for the scattering
of radiation by a free electron. We thus see that the direct scattering
of radiation by an electron in an atom is independent of the atom
and is correctly given by the classical theory. This result, it should
be remembered, holds only provided the wavelength of the radiation
is large compared with the dimensions of the atom.
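
The direct scattering coefficient can be summed over the two polarizations of the scattered photon and integrated over all directions to give a total cross-section. The following sketch does this by a plain midpoint rule, measuring areas in units of $(e^2/mc^2)^2$ and taking the incident polarization along the $x$-axis; the function names and the grid sizes are illustrative assumptions. The expected classical value for a free electron is $8\pi/3$ in these units.

```python
import math

def direct_scattering_coefficient(pol_out, pol_in, r_e=1.0):
    """e^4 / m^2 c^4 . (l' l0)^2 with r_e standing for e^2 / m c^2."""
    dot = sum(a * b for a, b in zip(pol_out, pol_in))
    return r_e ** 2 * dot ** 2

def total_cross_section(r_e=1.0, n_theta=200, n_phi=200):
    """Sum over both final polarizations and integrate over all scattering directions."""
    pol_in = (1.0, 0.0, 0.0)                 # incident polarization along x
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            # Two unit vectors perpendicular to the outgoing direction and to each other.
            e_theta = (math.cos(theta) * math.cos(phi),
                       math.cos(theta) * math.sin(phi),
                       -math.sin(theta))
            e_phi = (-math.sin(phi), math.cos(phi), 0.0)
            weight = math.sin(theta) * d_theta * d_phi   # element of solid angle
            total += (direct_scattering_coefficient(e_theta, pol_in, r_e)
                      + direct_scattering_coefficient(e_phi, pol_in, r_e)) * weight
    return total

sigma = total_cross_section()
```

The sum over the two final polarizations equals $1-\sin^2\theta\cos^2\phi$, so the integral can also be done by hand; the numerical value agrees with $8\pi/3$ to the accuracy of the grid.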

The direct scattering is a mathematical concept and cannot be
separated out experimentally from the total scattering, given by
formula (44) of § 51. Let us see what this total scattering is in the
case of photons. We must be careful in our application of formula
(44) of § 51. The summation $\sum_k$ in this formula may be considered as
representing the contribution to the scattering of double transitions
consisting of transitions firstly from the initial state to state $k$ and
secondly from state $k$ to the final state. The first transition may be
an absorption of the incident photon and the second an emission of
the required scattered photon, but it is also possible for the first
transition to be the emission and the second the absorption. It is
clear from the general nature of the method used for deriving formula
(44) of § 51 that both these kinds of double transitions must be included
in the summation $\sum_k$ when this formula is applied to photons,
although only the first of them appears in the actual derivation given
in § 51, as the possibility of the particle being created or annihilated
was not taken into account there.

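We use zero, single prime, and double prime to refer to the initial,
final, and intermediate states of the atom respectively, and zero and
single prime to refer to the absorbed and emitted photons respectively.
Then, for the double transition of absorption followed by
emission, we must take for the matrix elements

$$\langle k|V|\mathbf{p}^0\alpha^0\rangle,\qquad \langle\mathbf{p}'\alpha'|V|k\rangle$$

of the formula (44) of § 51

$$\langle k|V|\mathbf{p}^0\alpha^0\rangle = \langle\alpha''|H_Q|\mathbf{p}^0\mathbf{l}^0\alpha^0\rangle,\qquad \langle\mathbf{p}'\alpha'|V|k\rangle = \langle\mathbf{p}'\mathbf{l}'\alpha'|H_Q|\alpha''\rangle.$$

Also

$$E'-E_k = h\nu^0+H_P(\alpha^0)-H_P(\alpha'') = h[\nu^0-\nu(\alpha''\alpha^0)],$$

where

$$h\nu(\alpha''\alpha^0) = H_P(\alpha'')-H_P(\alpha^0).$$

Similarly, for the double transition of emission followed by absorption
we must take

$$\langle k|V|\mathbf{p}^0\alpha^0\rangle = \langle\mathbf{p}'\mathbf{l}'\alpha''|H_Q|\alpha^0\rangle,\qquad \langle\mathbf{p}'\alpha'|V|k\rangle = \langle\alpha'|H_Q|\mathbf{p}^0\mathbf{l}^0\alpha''\rangle$$

and

$$E'-E_k = h\nu^0+H_P(\alpha^0)-H_P(\alpha'')-h\nu^0-h\nu' = -h[\nu'+\nu(\alpha''\alpha^0)],$$

there being now two photons, of frequencies $\nu^0$ and $\nu'$, in existence
for the intermediate state. Substituting in (44) of § 51 the values of
the matrix elements given by (75), (76), and (77), we get for the
scattering coefficient

$$\frac{e^4}{h^2c^4}\frac{\nu'}{\nu^0}\,\Big|\frac{h}{m}(\mathbf{l}'\mathbf{l}^0)\,\delta_{\alpha'\alpha^0}+\sum_{\alpha''}\Big\{\frac{\langle\alpha'|\dot x_{\mathbf{l}'}|\alpha''\rangle\langle\alpha''|\dot x_{\mathbf{l}^0}|\alpha^0\rangle}{\nu^0-\nu(\alpha''\alpha^0)}-\frac{\langle\alpha'|\dot x_{\mathbf{l}^0}|\alpha''\rangle\langle\alpha''|\dot x_{\mathbf{l}'}|\alpha^0\rangle}{\nu'+\nu(\alpha''\alpha^0)}\Big\}\Big|^2. \tag{81}$$

If we write (81) in terms of $x$ instead of $\dot x$, we get

$$\frac{(2\pi e)^4}{h^2c^4}\frac{\nu'}{\nu^0}\,\Big|\frac{h}{4\pi^2m}(\mathbf{l}'\mathbf{l}^0)\,\delta_{\alpha'\alpha^0}-\sum_{\alpha''}\nu(\alpha'\alpha'')\,\nu(\alpha''\alpha^0)\Big\{\frac{\langle\alpha'|x_{\mathbf{l}'}|\alpha''\rangle\langle\alpha''|x_{\mathbf{l}^0}|\alpha^0\rangle}{\nu^0-\nu(\alpha''\alpha^0)}-\frac{\langle\alpha'|x_{\mathbf{l}^0}|\alpha''\rangle\langle\alpha''|x_{\mathbf{l}'}|\alpha^0\rangle}{\nu'+\nu(\alpha''\alpha^0)}\Big\}\Big|^2. \tag{82}$$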

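We can simplify (82) with the help of the quantum conditions.
We have
\[
x_{l'}x_{l^0} - x_{l^0}x_{l'} = 0,
\]
which gives
\[
\sum_{\alpha''}\bigl\{\langle\alpha'|x_{l'}|\alpha''\rangle\langle\alpha''|x_{l^0}|\alpha^0\rangle
-\langle\alpha'|x_{l^0}|\alpha''\rangle\langle\alpha''|x_{l'}|\alpha^0\rangle\bigr\} = 0, \tag{83}
\]
and also
\[
x_{l'}\dot x_{l^0} - \dot x_{l^0}x_{l'} = \frac{1}{m}(x_{l'}p_{l^0} - p_{l^0}x_{l'}) = \frac{i\hbar}{m}(\mathbf{l}'\mathbf{l}^0),
\]
which gives
\[
\sum_{\alpha''}\bigl\{\langle\alpha'|x_{l'}|\alpha''\rangle\,\nu(\alpha''\alpha^0)\,\langle\alpha''|x_{l^0}|\alpha^0\rangle
-\nu(\alpha'\alpha'')\,\langle\alpha'|x_{l^0}|\alpha''\rangle\langle\alpha''|x_{l'}|\alpha^0\rangle\bigr\}
= \frac{1}{2\pi i}\,\frac{i\hbar}{m}(\mathbf{l}'\mathbf{l}^0)\,\delta_{\alpha'\alpha^0}
= \frac{\hbar}{2\pi m}(\mathbf{l}'\mathbf{l}^0)\,\delta_{\alpha'\alpha^0}. \tag{84}
\]

Multiplying (83) by $\nu'$ and adding to (84), we obtain
\[
\sum_{\alpha''}\bigl\{\langle\alpha'|x_{l'}|\alpha''\rangle\langle\alpha''|x_{l^0}|\alpha^0\rangle[\nu'+\nu(\alpha''\alpha^0)]
-\langle\alpha'|x_{l^0}|\alpha''\rangle\langle\alpha''|x_{l'}|\alpha^0\rangle[\nu'+\nu(\alpha'\alpha'')]\bigr\}
= \frac{\hbar}{2\pi m}(\mathbf{l}'\mathbf{l}^0)\,\delta_{\alpha'\alpha^0}.
\]
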
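If we substitute this expression for $(\hbar/2\pi m)(\mathbf{l}'\mathbf{l}^0)\,\delta_{\alpha'\alpha^0}$ in (82), we
obtain, after a straightforward reduction making use of identical
relations between the $\nu$'s,
\[
\frac{(2\pi e)^4}{h^2c^4}\,\nu^0\nu'^3\,
\Bigl|\sum_{\alpha''}\Bigl\{
\frac{\langle\alpha'|x_{l'}|\alpha''\rangle\langle\alpha''|x_{l^0}|\alpha^0\rangle}{\nu^0-\nu(\alpha''\alpha^0)}
-\frac{\langle\alpha'|x_{l^0}|\alpha''\rangle\langle\alpha''|x_{l'}|\alpha^0\rangle}{\nu'+\nu(\alpha''\alpha^0)}
\Bigr\}\Bigr|^2. \tag{85}
\]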
This gives the scattering coefficient in the form of the effective 
area that a photon has to hit per unit solid angle of scattering. It is 
known as the Kramers-Heisenberg dispersion formula, having been first 
obtained by these authors from analogies with the classical theory 
of dispersion. 
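
The reduction from (82) to (85) can be checked coefficient by coefficient. The sketch below is only an illustration under stated assumptions: it takes energy conservation in the form ν⁰ − ν' = ν(α'α⁰), so that ν(α'α'') = ν⁰ − ν' − ν(α''α⁰), and it uses the combined quantum condition obtained by multiplying (83) by ν' and adding (84) to eliminate the direct-scattering term. The function name and the sample frequencies are hypothetical, chosen only for the check.

```python
from fractions import Fraction


def reduced_coefficients(nu0, nu1, a):
    """Coefficients of the two matrix-element products once the direct term
    has been eliminated with the quantum conditions.

    nu0, nu1 : incident and scattered frequencies (nu^0 and nu')
    a        : nu(alpha'' alpha^0) for one intermediate state
    """
    b = nu0 - nu1 - a  # nu(alpha' alpha''), assumed from energy conservation
    # Coefficient of <a'|x_l'|a''><a''|x_l0|a0>:
    first = (nu1 + a) - b * a / (nu0 - a)
    # Coefficient of <a'|x_l0|a''><a''|x_l'|a0>:
    second = -(nu1 + b) + b * a / (nu1 + a)
    return first, second


nu0, nu1, a = Fraction(7), Fraction(4), Fraction(2)
first, second = reduced_coefficients(nu0, nu1, a)
# Form expected from the dispersion formula: nu0*nu1 over each denominator.
expected_first = nu0 * nu1 / (nu0 - a)
expected_second = -nu0 * nu1 / (nu1 + a)
print(first == expected_first, second == expected_second)
```

Exact rational arithmetic is used so that the comparison is an identity check rather than a floating-point approximation.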

The fact that the various terms in (82) can be combined to give
the result (85) justifies the assumption made in deriving formula (44)
of § 51, that the matrix elements $\langle p'\alpha'|V|p''\alpha''\rangle$ of the interaction
energy are of the second order of smallness compared with the
$\langle p'\alpha'|V|k\rangle$ ones, at any rate when the scattered particles are photons.


65. An assembly of fermions 


An assembly of fermions can be treated by a method similar to 
that used in §§ 59 and 60 for bosons. With the kets (1) we may use 
the antisymmetrizing operator A defined by 


\[
A = u'!^{-1/2}\sum \pm P, \tag{2'}
\]
summed over all permutations $P$, the $+$ or $-$ sign being taken
according to whether $P$ is even or odd. Applied to the ket (1) it gives
\[
u'!^{-1/2}\sum \pm P\,|\alpha_1^a\alpha_2^b\alpha_3^c\ldots\alpha_{u'}^g\rangle = A|\alpha^a\alpha^b\alpha^c\ldots\alpha^g\rangle, \tag{3'}
\]
a ket corresponding to a state for an assembly of $u'$ fermions. The
ket (3’) is normalized provided the individual fermion kets $|\alpha^a\rangle$, $|\alpha^b\rangle$, $|\alpha^c\rangle$,...
are all different, otherwise it is zero. In this respect the ket (3’) is
simpler than the ket (3). However, (3’) is more complicated than (3)
in that (3’) depends on the order in which $\alpha^a$, $\alpha^b$, $\alpha^c$,... occur in it,
being subject to a change of sign if an odd permutation is applied
to this order.

We can, as before, introduce the numbers $n_1$, $n_2$, $n_3$,... of fermions
in the states $\alpha^{(1)}$, $\alpha^{(2)}$, $\alpha^{(3)}$,... and treat them as dynamical variables or
observables. They each have as eigenvalues only 0 and 1. They form
a complete set of commuting observables for the assembly of fermions.
The basic kets of a representation with the $n$'s diagonal may be taken
to be connected with the kets (3’) by the equation
\[
A|\alpha^a\alpha^b\alpha^c\ldots\alpha^g\rangle = \pm|n_1'n_2'n_3'\ldots\rangle \tag{6'}
\]
corresponding to (6), the $n'$'s being connected with the variables
$\alpha^a$, $\alpha^b$, $\alpha^c$,... by equation (4). The $\pm$ sign is needed in (6’) since, for
given $n'$'s, the occupied states $\alpha^a$, $\alpha^b$, $\alpha^c$,... are fixed but not their
order, so that the sign of the left-hand side of (6’) is not fixed. To
set up a rule which determines the sign in (6’), we must arrange all
the states $\alpha$ for a fermion arbitrarily in some standard order. The
$\alpha$'s occurring in the left-hand side of (6’) form a certain selection from
all the $\alpha$'s and the standard order for all the $\alpha$'s will give a standard
order for this selection. We now make the rule that the $+$ sign should
occur in (6’) if the $\alpha$'s on the left-hand side can be brought into their
standard order by an even permutation and the $-$ sign if an odd
permutation is required. Owing to the complexity of this rule,
the representation with the basic kets $|n_1'n_2'n_3'\ldots\rangle$ is not a very
useful one.

If the number of fermions in the assembly is variable, we can set 
up the complete set of kets
\[
|\rangle,\quad |\alpha^a\rangle,\quad A|\alpha^a\alpha^b\rangle,\quad A|\alpha^a\alpha^b\alpha^c\rangle,\quad \ldots \tag{9'}
\]
corresponding to (9). A general ket is now expressible as a sum of 
the various kets (9’). 

To continue with the development we introduce a set of linear
operators $\eta$, $\bar\eta$, one pair $\eta_a$, $\bar\eta_a$ corresponding to each fermion state $\alpha^a$,
satisfying the commutation relations
\[
\begin{aligned}
\eta_a\eta_b + \eta_b\eta_a &= 0,\\
\bar\eta_a\bar\eta_b + \bar\eta_b\bar\eta_a &= 0,\\
\bar\eta_a\eta_b + \eta_b\bar\eta_a &= \delta_{ab}.
\end{aligned} \tag{11'}
\]
These relations are like (11) with a $+$ sign instead of a $-$ on the
left-hand side. They show that, for $a \neq b$, $\eta_a$ and $\bar\eta_a$ anticommute with
$\eta_b$ and $\bar\eta_b$, while, putting $b = a$, they give
\[
\eta_a^2 = 0, \qquad \bar\eta_a^2 = 0, \qquad \bar\eta_a\eta_a + \eta_a\bar\eta_a = 1. \tag{11''}
\]
To verify that the relations (11’) are consistent, we note that linear
operators $\eta$, $\bar\eta$ satisfying the conditions (11’) can be constructed in
the following way. For each state $\alpha^a$ we take a set of linear operators
$\sigma_{xa}$, $\sigma_{ya}$, $\sigma_{za}$ like the $\sigma_x$, $\sigma_y$, $\sigma_z$ introduced in § 37 to describe the spin
of an electron and such that $\sigma_{xa}$, $\sigma_{ya}$, $\sigma_{za}$ commute with $\sigma_{xb}$, $\sigma_{yb}$, $\sigma_{zb}$
for $b \neq a$. We also take an independent set of linear operators $\zeta_a$,
one for each state $\alpha^a$, which all anticommute with one another and
have their squares unity, and commute with all the $\sigma$ variables.
Then, putting
\[
\eta_a = \tfrac12\zeta_a(\sigma_{xa} - i\sigma_{ya}), \qquad \bar\eta_a = \tfrac12\zeta_a(\sigma_{xa} + i\sigma_{ya}),
\]
we have all the conditions (11’) satisfied.
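
As an illustration only, this construction can be tried out with explicit matrices for two fermion states. The sketch assumes the combination η_a = ½ζ_a(σ_xa − iσ_ya), with η̄_a its conjugate. It also assumes one particular realization of the ζ's, namely the σ_x and σ_y matrices of an extra auxiliary two-state factor, which anticommute, square to unity, and commute with every σ of the fermion states. All names in the code are hypothetical.

```python
from functools import reduce

import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)


def kron(*factors):
    """Tensor product of several matrices, left to right."""
    return reduce(np.kron, factors)


def anticommutator(a, b):
    return a @ b + b @ a


# Space = (auxiliary factor) x (state 1) x (state 2), dimension 8.
# Assumption: zeta_1, zeta_2 are realized as sigma_x, sigma_y of the auxiliary factor.
zeta = [kron(SX, I2, I2), kron(SY, I2, I2)]
sigma_x = [kron(I2, SX, I2), kron(I2, I2, SX)]
sigma_y = [kron(I2, SY, I2), kron(I2, I2, SY)]

eta = [0.5 * zeta[a] @ (sigma_x[a] - 1j * sigma_y[a]) for a in range(2)]
eta_bar = [0.5 * zeta[a] @ (sigma_x[a] + 1j * sigma_y[a]) for a in range(2)]


def satisfies_11_prime(eta, eta_bar):
    """Check the three anticommutation relations for every pair (a, b)."""
    dim = eta[0].shape[0]
    zero, one = np.zeros((dim, dim)), np.eye(dim)
    for a in range(len(eta)):
        for b in range(len(eta)):
            if not np.allclose(anticommutator(eta[a], eta[b]), zero):
                return False
            if not np.allclose(anticommutator(eta_bar[a], eta_bar[b]), zero):
                return False
            target = one if a == b else zero
            if not np.allclose(anticommutator(eta_bar[a], eta[b]), target):
                return False
    return True


print(satisfies_11_prime(eta, eta_bar))
```

The same check extends to more states by enlarging the auxiliary factor so that it carries the required number of mutually anticommuting ζ's.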
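From (11’’)
\[
(\eta_a\bar\eta_a)^2 = \eta_a\bar\eta_a\eta_a\bar\eta_a = \eta_a(1-\eta_a\bar\eta_a)\bar\eta_a = \eta_a\bar\eta_a.
\]
This is an algebraic equation for $\eta_a\bar\eta_a$, showing that $\eta_a\bar\eta_a$ is an
observable with the eigenvalues 0 and 1. Also $\eta_a\bar\eta_a$ commutes with
$\eta_b\bar\eta_b$ for $b \neq a$. These results allow us to put
\[
\eta_a\bar\eta_a = n_a, \tag{12'}
\]
the same as (12). From (11’’) we get now
\[
\bar\eta_a\eta_a = 1 - n_a, \tag{13'}
\]
the equation corresponding to (13).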
Let us write the normalized ket which is an eigenket of all the $n$'s
belonging to the eigenvalues zero as $|0\rangle$. Then
\[
n_a|0\rangle = 0,
\]
so from (12’)
\[
\langle 0|\eta_a\bar\eta_a|0\rangle = 0.
\]
Hence
\[
\bar\eta_a|0\rangle = 0, \tag{15'}
\]
like (15). Again
\[
\langle 0|\bar\eta_a\eta_a|0\rangle = \langle 0|(1-n_a)|0\rangle = \langle 0|0\rangle = 1,
\]
showing that $\eta_a|0\rangle$ is normalized, and
\[
n_a\eta_a|0\rangle = \eta_a\bar\eta_a\eta_a|0\rangle = \eta_a(1-n_a)|0\rangle = \eta_a|0\rangle,
\]
showing that $\eta_a|0\rangle$ is an eigenket of $n_a$ belonging to the eigenvalue
unity. It is an eigenket of the other $n$'s belonging to the eigenvalues
zero, since the other $n$'s commute with $\eta_a$. By generalizing the
argument we see that $\eta_a\eta_b\eta_c\ldots\eta_g|0\rangle$ is normalized and is a
simultaneous eigenket of all the $n$'s, belonging to the eigenvalues unity
for $n_a$, $n_b$, $n_c$,..., $n_g$ and zero for the other $n$'s. This enables us to put
\[
A|\alpha^a\alpha^b\alpha^c\ldots\alpha^g\rangle = \eta_a\eta_b\eta_c\ldots\eta_g|0\rangle, \tag{17'}
\]
both sides being antisymmetrical in the labels $a$, $b$, $c$,..., $g$. We have
here the analogue of (17). The $\eta$'s appear as creation operators for
a fermion and the $\bar\eta$'s as annihilation operators.

If we pass over to a different set of basic kets $|\beta^A\rangle$ for a fermion,
we can introduce a new set of linear operators $\eta_A$ corresponding to
them. We then find, by the same argument as in the case of bosons,
that the new $\eta$'s are connected with the original ones by (21). This
shows that there is a procedure of second quantization for fermions,
similar to that for bosons, with the only difference that the
commutation relations (11’) must be employed for fermions to replace the
commutation relations (11) for bosons.

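A symmetrical linear operator $U_T$ of the form (22) can be expressed
in terms of the $\eta$, $\bar\eta$ variables by a method similar to that used for
bosons. Equation (24) still holds, and so does (25) with $S$ replaced
by $A$. Instead of (26) we now have
\[
\begin{aligned}
U_T\,\eta_{x_1}\eta_{x_2}\ldots|0\rangle
&= \sum_a\sum_r \eta_{x_1}\ldots\eta_{x_{r-1}}\,\eta_a\,\eta_{x_{r+1}}\ldots|0\rangle\,\langle a|U|x_r\rangle\\
&= \sum_{ab}\eta_a\sum_r(-)^{r-1}\,\eta_{x_r}^{-1}\eta_{x_1}\eta_{x_2}\ldots|0\rangle\,\delta_{bx_r}\langle a|U|b\rangle,
\end{aligned} \tag{26'}
\]
$\eta_{x_r}^{-1}$ meaning that the factor $\eta_{x_r}$ must be cancelled out, without its
position among the other $\eta_x$'s being changed before the cancellation.
Instead of (27) we have
\[
\bar\eta_b\,\eta_{x_1}\eta_{x_2}\ldots|0\rangle = \sum_r(-)^{r-1}\,\eta_{x_r}^{-1}\eta_{x_1}\eta_{x_2}\ldots|0\rangle\,\delta_{bx_r}, \tag{27'}
\]
so (28) holds unchanged and thus (29) holds unchanged. We have
the same final form (29) for $U_T$ in the fermion case as in the boson
case. Similarly, a symmetrical linear operator $V_T$ of the form (30) can
be expressed as
\[
V_T = \tfrac12\sum_{abcd}\eta_a\eta_b\,\langle ab|V|cd\rangle\,\bar\eta_d\bar\eta_c, \tag{35'}
\]
the same as one of the ways of writing (35).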

The foregoing work shows that there is a deep-seated analogy 
between the theory of fermions and that of bosons, only slight 
changes having to be made in the general equations of the formalism 
when one passes from one to the other.

There is, however, a development of the theory of fermions that
has no analogue for bosons. For fermions there are only the two
alternatives of a state being occupied or unoccupied and there is
symmetry between these two alternatives. One can demonstrate the
symmetry mathematically by making a transformation which
interchanges the concepts of ‘occupied’ and ‘unoccupied’, namely
\[
\eta_a^* = \bar\eta_a, \qquad \bar\eta_a^* = \eta_a,
\]
\[
n_a^* = \eta_a^*\bar\eta_a^* = \bar\eta_a\eta_a = 1 - n_a.
\]
The creation operators of the unstarred variables are the annihilation
operators of the starred variables, and vice versa. The starred variables
are now seen to satisfy the same quantum conditions and to have all
the same properties as the unstarred ones.

If there are only a few unoccupied states, a convenient standard 

ket to work with would be the one for which every state is occupied, 
namely $|0^*\rangle$ satisfying
\[
n_a|0^*\rangle = |0^*\rangle.
\]
It thus satisfies
\[
n_a^*|0^*\rangle = 0,
\]
or
\[
\bar\eta_a^*|0^*\rangle = 0.
\]
Other states for the assembly will now be represented by
\[
\eta_a^*\eta_b^*\eta_c^*\ldots|0^*\rangle,
\]
in which variables appear referring to the unoccupied fermion states
$a$, $b$, $c$,... We may look upon these unoccupied fermion states as holes
among the occupied ones and the $\eta^*$ variables as the operators of
creation of such holes. The holes are just as much physical things 
as the original particles and are also fermions. 
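
The occupied/unoccupied symmetry can be made concrete for a single fermion state. The following is a minimal sketch, assuming a two-dimensional matrix representation with basis order (unoccupied, occupied); the variable names are hypothetical.

```python
import numpy as np

# One fermion state, basis order: index 0 = unoccupied, index 1 = occupied.
eta = np.array([[0, 0], [1, 0]], dtype=complex)   # creation operator
eta_bar = eta.conj().T                            # annihilation operator
n = eta @ eta_bar                                 # number operator, diag(0, 1)

# Starred variables: creation and annihilation interchanged.
eta_star = eta_bar
eta_bar_star = eta
n_star = eta_star @ eta_bar_star                  # number of holes

filled = np.array([0, 1], dtype=complex)          # the ket |0*>, state occupied
one_hole = eta_star @ filled                      # a hole created in |0*>

print(np.allclose(n_star, np.eye(2) - n))
```

Here `one_hole` coincides with the original unoccupied ket, which is the sense in which creating a hole removes a particle.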


XI 
RELATIVISTIC THEORY OF THE ELECTRON 


66. Relativistic treatment of a particle 

The theory we have been building up so far is essentially a
non-relativistic one. We have been working all the time with one
particular Lorentz frame of reference and have set up the theory as an
analogue of the classical non-relativistic dynamics. Let us now try 
to make the theory invariant under Lorentz transformations, so that 
it conforms to the special principle of relativity. This is necessary in 
order that the theory may apply to high-speed particles. There is no 
need to make the theory conform to general relativity, since general 
relativity is required only when one is dealing with gravitation, and 
gravitational forces are quite unimportant in atomic phenomena. 

Let us see how the basic ideas of quantum theory can be adapted 
to the relativistic point of view that the four dimensions of space- 
time should be treated on the same footing. The general principle 
of superposition of states, as given in Chapter I, is a relativistic
principle, since it applies to ‘states’ with the relativistic space-time 
meaning. However, the general concept of an observable does not fit 
in, since an observable may involve physical things at widely separated 
points at one instant of time. In consequence, if one works with a 
general representation referring to any complete set of commuting 
observables, the theory cannot display the symmetry between space 
and time required by relativity. In relativistic quantum mechanics 
one must be content with having one representation which displays 
this symmetry. One then has the freedom to transform to another 
representation referring to a special Lorentz frame of reference if it 
is useful for a particular calculation. 

For the problem of a single particle, in order to display the
symmetry between space and time we must use the Schrödinger
representation. Let us put $x_1$, $x_2$, $x_3$ for $x$, $y$, $z$, and $x_0$ for $ct$. The
time-dependent wave function then appears as $\psi(x_0 x_1 x_2 x_3)$ and provides
us with a basis for treating the four $x$'s on the same footing.

We shall use relativistic notation, writing the four $x$'s as $x_\mu$
($\mu = 0, 1, 2, 3$). Any space-time vector with four components which
transform under Lorentz transformations like the four elements $dx_\mu$
will be written like $a_\mu$ with a lower Greek suffix. We may raise the
suffix according to the rules
\[
a^0 = a_0, \qquad a^1 = -a_1, \qquad a^2 = -a_2, \qquad a^3 = -a_3. \tag{1}
\]
The $a_\mu$ are called the contravariant components of the vector $a$, and
the $a^\mu$ the covariant components. Two vectors $a_\mu$ and $b_\mu$ have a
Lorentz-invariant scalar product
\[
a_0b_0 - a_1b_1 - a_2b_2 - a_3b_3 = a^\mu b_\mu = a_\mu b^\mu,
\]
a summation being implied over a repeated letter suffix. The
fundamental tensor $g^{\mu\nu}$ is defined by
\[
g^{00} = 1, \qquad g^{11} = g^{22} = g^{33} = -1, \qquad g^{\mu\nu} = 0 \ \text{for } \mu \neq \nu. \tag{2}
\]
With its help the rules (1) connecting covariant and contravariant
components may be written
\[
a^\mu = g^{\mu\nu}a_\nu.
\]
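
The rules (1) and (2) are easy to try out numerically. The sketch below is an illustration only: it stores the lower-suffix components as a length-4 array, raises the suffix with the diagonal tensor of (2), and checks that the scalar product is unchanged by a boost along $x_1$. The function names and sample vectors are hypothetical.

```python
import numpy as np

# Fundamental tensor g^{mu nu} of rule (2).
G = np.diag([1.0, -1.0, -1.0, -1.0])


def raise_suffix(a_lower):
    """a^mu = g^{mu nu} a_nu, i.e. rule (1)."""
    return G @ a_lower


def scalar_product(a_lower, b_lower):
    """a^mu b_mu = a_0 b_0 - a_1 b_1 - a_2 b_2 - a_3 b_3."""
    return raise_suffix(a_lower) @ b_lower


def boost_x1(beta):
    """Lorentz boost with velocity beta*c along x_1, acting on lower-suffix components."""
    gamma = 1.0 / np.sqrt(1.0 - beta**2)
    L = np.eye(4)
    L[0, 0] = L[1, 1] = gamma
    L[0, 1] = L[1, 0] = -gamma * beta
    return L


a = np.array([2.0, 1.0, 0.0, 3.0])
b = np.array([1.0, 4.0, -1.0, 2.0])
L = boost_x1(0.6)
print(scalar_product(a, b), scalar_product(L @ a, L @ b))
```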

In the Schrödinger representation the momentum, whose
components will now be written $p_1$, $p_2$, $p_3$ instead of $p_x$, $p_y$, $p_z$, is equal
to the operator
\[
p_r = -i\hbar\,\partial/\partial x_r \qquad (r = 1, 2, 3). \tag{3}
\]
Now the four operators $\partial/\partial x_\mu$ form the covariant components of a
4-vector whose contravariant components are written $\partial/\partial x^\mu$. So to
bring (3) into a relativistic theory, we must first write it with its
suffixes balanced,
\[
p_r = i\hbar\,\partial/\partial x^r,
\]
and then extend it to the complete 4-vector equation
\[
p_\mu = i\hbar\,\partial/\partial x^\mu. \tag{4}
\]
We thus have to introduce a new dynamical variable $p_0$, equal to
the operator $i\hbar\,\partial/\partial x_0$. Since it forms a 4-vector when combined with the
momenta $p_r$, it must have the physical meaning of the energy of the
particle divided by $c$. We can proceed to develop the theory treating
the four $p$'s on the same footing, like the four $x$'s.

In the theory of the electron that will be developed here we shall 
have to introduce a further degree of freedom describing an internal 
motion of the electron. The wave function will thus have to involve 
a further variable besides the four x’s. 


67. The wave equation for the electron 


Let us consider first the case of the motion of an electron in the
absence of an electromagnetic field, so that the problem is simply
that of the free particle, as dealt with in § 30, with the possible
addition of internal degrees of freedom. The relativistic Hamiltonian
provided by classical mechanics for this system is given by equation
(23) of § 30, and leads to the wave equation
\[
\{p_0 - (m^2c^2 + p_1^2 + p_2^2 + p_3^2)^{1/2}\}\psi = 0, \tag{5}
\]
where the $p$'s are interpreted as operators in accordance with
(4). Equation (5), although it takes into account the relation between
energy and momentum required by relativity, is yet unsatisfactory
from the point of view of relativistic theory, because it is very
unsymmetrical between $p_0$ and the other $p$'s, so much so that one cannot
generalize it in a relativistic way to the case when there is a field
present. We must therefore look for a new wave equation.

If we multiply the wave equation (5) on the left by the operator
$\{p_0 + (m^2c^2 + p_1^2 + p_2^2 + p_3^2)^{1/2}\}$, we obtain the equation
\[
\{p_0^2 - m^2c^2 - p_1^2 - p_2^2 - p_3^2\}\psi = 0, \tag{6}
\]
which is of a relativistically invariant form and may therefore more
conveniently be taken as the basis of a relativistic theory. Equation
(6) is not completely equivalent to equation (5) since, although every
solution of (5) is also a solution of (6), the converse is not true. Only
those solutions of (6) belonging to positive values for $p_0$ are also
solutions of (5).

The wave equation (6) is not of the form required by the general
laws of the quantum theory on account of its being quadratic in $p_0$.
In § 27 we deduced from quite general arguments that the wave
equation must be linear in the operator $\partial/\partial t$ or $p_0$, like equation (7)
of that section. We therefore seek a wave equation that is linear
in $p_0$ and that is roughly equivalent to (6). In order that this wave
equation shall transform in a simple way under a Lorentz
transformation, we try to arrange that it shall be rational and linear in $p_1$, $p_2$,
and $p_3$ as well as in $p_0$, and thus of the form
\[
\{p_0 - \alpha_1p_1 - \alpha_2p_2 - \alpha_3p_3 - \beta\}\psi = 0, \tag{7}
\]
where the $\alpha$'s and $\beta$ are independent of the $p$'s. Since we are
considering the case of no field, all points in space-time must be equivalent,
so that the operator in the wave equation must not involve the $x$'s.
Thus the $\alpha$'s and $\beta$ must also be independent of the $x$'s, so that they
must commute with the $p$'s and the $x$'s. They therefore describe
some new degree of freedom, belonging to some internal motion in
the electron. We shall see later that they bring in the spin of the
electron.

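Multiplying (7) by the operator $\{p_0 + \alpha_1p_1 + \alpha_2p_2 + \alpha_3p_3 + \beta\}$ on the
left, we obtain
\[
\Bigl\{p_0^2 - \sum_{123}\bigl[\alpha_1^2p_1^2 + (\alpha_1\alpha_2 + \alpha_2\alpha_1)p_1p_2 + (\alpha_1\beta + \beta\alpha_1)p_1\bigr] - \beta^2\Bigr\}\psi = 0,
\]
where $\sum_{123}$ refers to cyclic permutations of the suffixes 1, 2, 3. This is
the same as (6) if the $\alpha$'s and $\beta$ satisfy the relations
\[
\alpha_1^2 = 1, \qquad \alpha_1\alpha_2 + \alpha_2\alpha_1 = 0,
\]
\[
\beta^2 = m^2c^2, \qquad \alpha_1\beta + \beta\alpha_1 = 0,
\]
together with the relations obtained from these by permuting the
suffixes 1, 2, 3. If we write
\[
\beta = \alpha_m mc,
\]
these relations may be summed up in the single one,
\[
\alpha_a\alpha_b + \alpha_b\alpha_a = 2\delta_{ab} \qquad (a, b = 1, 2, 3, \text{ or } m). \tag{8}
\]
The four $\alpha$'s all anticommute with one another and the square of
each is unity.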

Thus by giving suitable properties to the $\alpha$'s and $\beta$ we can make
the wave equation (7) equivalent to (6), in so far as the motion of
the electron as a whole is concerned. We may now assume (7) is the
correct relativistic wave equation for the motion of an electron in
the absence of a field. This gives rise to one difficulty, however,
owing to the fact that (7), like (6), is not exactly equivalent to (5),
but allows solutions corresponding to negative as well as positive
values of $p_0$. The former do not, of course, correspond to any actually
observable motion of an electron. For the present we shall consider
only the positive-energy solutions and shall leave the discussion of
the negative-energy ones to § 73.

We can easily obtain a representation of the four $\alpha$'s. They have
similar algebraic properties to the $\sigma$'s introduced in § 37, which $\sigma$'s
can be represented by matrices with two rows and columns. So long
as we keep to matrices with two rows and columns we cannot get a
representation of more than three anticommuting quantities, and we
have to go to four rows and columns to get a representation of the
four anticommuting $\alpha$'s. It is convenient first to express the $\alpha$'s in
terms of the $\sigma$'s and also of a second similar set of three
anticommuting variables whose squares are unity, $\rho_1$, $\rho_2$, $\rho_3$ say, that are
independent of and commute with the $\sigma$'s. We may take, amongst
other possibilities,
\[
\alpha_1 = \rho_1\sigma_1, \qquad \alpha_2 = \rho_1\sigma_2, \qquad \alpha_3 = \rho_1\sigma_3, \qquad \alpha_m = \rho_3, \tag{9}
\]
and the $\alpha$'s will then satisfy all the relations (8), as may easily be
verified. If we now take a representation with $\rho_3$ and $\sigma_3$ diagonal,
we shall get the following scheme of matrices:
\[
\sigma_1 = \begin{pmatrix} 0&1&0&0\\ 1&0&0&0\\ 0&0&0&1\\ 0&0&1&0 \end{pmatrix}, \quad
\sigma_2 = \begin{pmatrix} 0&-i&0&0\\ i&0&0&0\\ 0&0&0&-i\\ 0&0&i&0 \end{pmatrix}, \quad
\sigma_3 = \begin{pmatrix} 1&0&0&0\\ 0&-1&0&0\\ 0&0&1&0\\ 0&0&0&-1 \end{pmatrix},
\]
\[
\rho_1 = \begin{pmatrix} 0&0&1&0\\ 0&0&0&1\\ 1&0&0&0\\ 0&1&0&0 \end{pmatrix}, \quad
\rho_2 = \begin{pmatrix} 0&0&-i&0\\ 0&0&0&-i\\ i&0&0&0\\ 0&i&0&0 \end{pmatrix}, \quad
\rho_3 = \begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&-1&0\\ 0&0&0&-1 \end{pmatrix}.
\]
It should be noted that the $\rho$'s and $\sigma$'s are all Hermitian, which makes
the $\alpha$'s also Hermitian.
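
The statement that the choice (9) satisfies (8) can be checked directly. The sketch below is illustrative: it assumes the 4×4 matrices are built as tensor products, with the $\rho$'s acting on the first two-valued factor and the $\sigma$'s on the second, which gives a representation with $\rho_3$ and $\sigma_3$ diagonal. The helper names are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

# rho acts on the first factor, sigma on the second.
sigma = [np.kron(I2, s) for s in (SX, SY, SZ)]
rho = [np.kron(s, I2) for s in (SX, SY, SZ)]

# Choice (9): alpha_r = rho_1 sigma_r (r = 1, 2, 3), alpha_m = rho_3.
alpha = [rho[0] @ sigma[k] for k in range(3)] + [rho[2]]


def satisfies_relation_8(alpha):
    """alpha_a alpha_b + alpha_b alpha_a = 2 delta_ab for all four alphas."""
    for a in range(4):
        for b in range(4):
            target = 2 * np.eye(4) if a == b else np.zeros((4, 4))
            if not np.allclose(alpha[a] @ alpha[b] + alpha[b] @ alpha[a], target):
                return False
    return True


def all_hermitian(matrices):
    return all(np.allclose(m, m.conj().T) for m in matrices)


print(satisfies_relation_8(alpha), all_hermitian(alpha))
```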

Corresponding to the four rows and columns, the wave function $\psi$
must contain a variable that takes on four values, in order that the
matrices shall be capable of being multiplied into it. Alternatively,
we may look upon the wave function as having four components, each
a function only of the four $x$'s. We saw in § 37 that the spin of the
electron requires the wave function to have two components. The 
fact that our present theory gives four is due to our wave equation 
(7) having twice as many solutions as it ought to have, half of them 
corresponding to states of negative energy. 

With the help of (9), the wave equation (7) may be written with
three-dimensional vector notation
\[
\{p_0 - \rho_1(\boldsymbol\sigma, \mathbf p) - \rho_3mc\}\psi = 0. \tag{10}
\]
To generalize this equation to the case when there is an
electromagnetic field present, we follow the classical rule of replacing $p_0$ and
$\mathbf p$ by $p_0 + e/c\,.A_0$ and $\mathbf p + e/c\,.\mathbf A$, $A_0$ and $\mathbf A$ being the scalar and vector
potentials of the field at the place where the electron is. This gives
us the equation
\[
\Bigl\{p_0 + \frac{e}{c}A_0 - \rho_1\Bigl(\boldsymbol\sigma, \mathbf p + \frac{e}{c}\mathbf A\Bigr) - \rho_3mc\Bigr\}\psi = 0, \tag{11}
\]
which is the fundamental wave equation of the relativistic theory of
the electron.

The four components of $\psi$ in (10) or (11) should be pictured as written
one below another, so as to form a single-column matrix. The square
matrices $\rho$ and $\sigma$ then get multiplied into the single-column matrix $\psi$
according to matrix multiplication, the product being in each case
another single-column matrix. The conjugate imaginary wave
function that represents a bra should be pictured as having its four
components written one beside another, so as to form a single-row matrix,
which can be multiplied from the right by a square matrix $\rho$ or $\sigma$ to give
another single-row matrix. We denote this conjugate imaginary wave
function pictured as a single-row matrix by $\bar\psi^\dagger$, using the symbol $\dagger$ to
denote the transpose of any matrix, i.e. the result of interchanging
the rows and columns. Then the conjugate imaginary of equation (11)
reads
\[
\bar\psi^\dagger\Bigl\{p_0 + \frac{e}{c}A_0 - \rho_1\Bigl(\boldsymbol\sigma, \mathbf p + \frac{e}{c}\mathbf A\Bigr) - \rho_3mc\Bigr\} = 0, \tag{12}
\]
in which the operators $p$ operate to the left. An operator of
differentiation operating to the left must be interpreted according to (24) of § 22.


68. Invariance under a Lorentz transformation 
Before proceeding to discuss the physical consequences of the wave
equation (11) or (12), we shall first verify that our theory really is 
invariant under a Lorentz transformation, or, stated more accurately, 
that the physical results the theory leads to are independent of the 
Lorentz frame of reference used. This is not by any means obvious 
from the form of the wave equation (11). We have to verify that, if 
we write down the wave equation in a different Lorentz frame, the 
solutions of the new wave equation may be put into one-one corre- 
spondence with those of the original one in such a way that corre- 
sponding solutions may be assumed to represent the same state. For 
either Lorentz frame, the square of the modulus of the wave function, 
summed over the four components, should give the probability per 
unit volume of the electron being at a certain place in that Lorentz 
frame. We may call this the probability density. Its values, calculated 
in different Lorentz frames for wave functions representing the same 
state, should be connected like the time components in these frames 
of some 4-vector. Further, the 4-dimensional divergence of this
4-vector should vanish, signifying conservation of the electron, or that
the electron cannot appear or disappear in any volume without passing 
through the boundary.

For brevity it is convenient to introduce the symbol $\alpha_0 = 1$ and to
suppose that the suffixes of the four $\alpha_\mu$ ($\mu = 0, 1, 2, 3$) can be raised
in accordance with the rules (1), even though these four $\alpha$'s do not
form the components of a 4-vector. We can now write the wave
equation (11)
\[
\{\alpha^\mu(p_\mu + e/c\,.A_\mu) - \alpha_m mc\}\psi = 0. \tag{13}
\]
The four $\alpha^\mu$ satisfy
\[
\alpha^\mu\alpha_m\alpha^\nu + \alpha^\nu\alpha_m\alpha^\mu = 2g^{\mu\nu}\alpha_m, \tag{14}
\]
with $g^{\mu\nu}$ defined by (2), as one can verify by taking separately the cases
when $\mu$ and $\nu$ are both 0, when one of them is 0, and when neither of
them is 0.

Let us apply an infinitesimal Lorentz transformation and distinguish
quantities referring to the new frame of reference by a star. The
components of the 4-vector $p_\mu$ will transform according to equations of
the type
\[
p_\mu^* = p_\mu + a_\mu{}^\nu p_\nu, \tag{15}
\]
where the $a_\mu{}^\nu$ are small numbers of the first order. We shall neglect
quantities that are quadratic in the $a$'s and thus of the second order.
The condition for a Lorentz transformation is that
\[
p_\mu^*p^{\mu*} = p_\mu p^\mu,
\]
which gives
\[
a_\mu{}^\nu p_\nu p^\mu + p_\mu a^{\mu\nu}p_\nu = 0,
\]
leading to
\[
a^{\mu\nu} + a^{\nu\mu} = 0. \tag{16}
\]
The components of $A_\mu$ will transform according to the same law, so
we have
\[
p_\mu + e/c\,.A_\mu = p_\mu^* + e/c\,.A_\mu^* - a_\mu{}^\nu(p_\nu^* + e/c\,.A_\nu^*).
\]
Thus the wave equation (13) becomes
\[
\{(\alpha^\mu - \alpha^\lambda a_\lambda{}^\mu)(p_\mu^* + e/c\,.A_\mu^*) - \alpha_m mc\}\psi = 0. \tag{17}
\]
Define
\[
M = \tfrac14 a_{\rho\sigma}\alpha^\rho\alpha_m\alpha^\sigma. \tag{18}
\]
Then from (14)
\[
\begin{aligned}
\alpha^\mu\alpha_mM - M\alpha_m\alpha^\mu
&= \tfrac14 a_{\rho\sigma}\{(\alpha^\mu\alpha_m\alpha^\rho + \alpha^\rho\alpha_m\alpha^\mu)\alpha_m\alpha^\sigma
 - \alpha^\rho\alpha_m(\alpha^\mu\alpha_m\alpha^\sigma + \alpha^\sigma\alpha_m\alpha^\mu)\}\\
&= \tfrac12 a_{\rho\sigma}(g^{\mu\rho}\alpha^\sigma - \alpha^\rho g^{\mu\sigma})\\
&= -a_\rho{}^\mu\alpha^\rho
\end{aligned}
\]
with the help of (16), and hence
\[
\alpha^\mu(1 + \alpha_mM) = (1 + M\alpha_m)(\alpha^\mu - a_\rho{}^\mu\alpha^\rho). \tag{19}
\]
Thus, multiplying (17) by $(1 + M\alpha_m)$ on the left, we get
\[
\{\alpha^\mu(1 + \alpha_mM)(p_\mu^* + e/c\,.A_\mu^*) - (\alpha_m + M)mc\}\psi = 0.
\]
So if we put
\[
(1 + \alpha_mM)\psi = \psi^*, \tag{20}
\]
we get
\[
\{\alpha^\mu(p_\mu^* + e/c\,.A_\mu^*) - \alpha_m mc\}\psi^* = 0. \tag{21}
\]
This is of the same form as (13) with the starred variables $p_\mu^*$, $A_\mu^*$, $\psi^*$,
and shows that (13) is invariant under an infinitesimal Lorentz
transformation, provided $\psi$ is subjected to the right transformation, given
by (20). A finite Lorentz transformation can be built up from
infinitesimal ones, so under a finite Lorentz transformation the wave equation
(13) is also invariant. Note that the matrices $\alpha$ do not get altered at all.
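
The matrix identity underlying (19) can be checked numerically. This is a sketch under stated assumptions: the $\alpha$'s are taken in the tensor-product representation $\alpha_r = \rho_1\sigma_r$, $\alpha_m = \rho_3$, the suffixes are raised with the diagonal $g^{\mu\nu}$ of (2), and the coefficients $a^{\mu\nu}$ are an arbitrary real antisymmetric array as required by (16). It verifies $\alpha^\mu\alpha_mM - M\alpha_m\alpha^\mu = -a_\rho{}^\mu\alpha^\rho$, which holds exactly because both sides are linear in the $a$'s. All names are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

G = np.diag([1.0, -1.0, -1.0, -1.0])  # g^{mu nu}, numerically equal to g_{mu nu}

# alpha_mu with lower suffix: alpha_0 = 1, alpha_r = rho_1 sigma_r; alpha_m = rho_3.
ALPHA_LOW = [np.eye(4, dtype=complex)] + [np.kron(SX, s) for s in (SX, SY, SZ)]
ALPHA_M = np.kron(SZ, I2)
ALPHA_UP = [G[mu, mu] * ALPHA_LOW[mu] for mu in range(4)]


def build_M(a_up):
    """M = (1/4) a_{rho sigma} alpha^rho alpha_m alpha^sigma, definition (18)."""
    a_low = G @ a_up @ G  # both suffixes lowered
    M = np.zeros((4, 4), dtype=complex)
    for r in range(4):
        for s in range(4):
            M += 0.25 * a_low[r, s] * ALPHA_UP[r] @ ALPHA_M @ ALPHA_UP[s]
    return M


def identity_residual(a_up):
    """Largest deviation of alpha^mu alpha_m M - M alpha_m alpha^mu from -a_rho^mu alpha^rho."""
    a_mixed = G @ a_up  # a_rho^mu: first suffix lowered, second raised
    M = build_M(a_up)
    worst = 0.0
    for mu in range(4):
        lhs = ALPHA_UP[mu] @ ALPHA_M @ M - M @ ALPHA_M @ ALPHA_UP[mu]
        rhs = -sum(a_mixed[r, mu] * ALPHA_UP[r] for r in range(4))
        worst = max(worst, float(np.abs(lhs - rhs).max()))
    return worst


rng = np.random.default_rng(0)
raw = 1e-3 * rng.standard_normal((4, 4))
a_up = raw - raw.T  # antisymmetric, condition (16)
print(identity_residual(a_up) < 1e-12)
```

The same `build_M` can be used to confirm that $M$ changes sign under Hermitian conjugation when the $a$'s are real.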

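The invariance proved above means that the solutions $\psi$ of the
original wave equation (13) are in one-one correspondence with the
solutions $\psi^*$ of the new wave equation (21), corresponding solutions
being connected by (20). We assume that corresponding solutions
represent the same physical state. We must now verify that the
physical interpretations of corresponding solutions, referred to their
respective Lorentz frames of reference, are in agreement. This requires
that $\bar\psi^\dagger\psi$ should give the probability density referred to the original
frame and $\bar\psi^{*\dagger}\psi^*$ the probability density referred to the new frame.
Let us examine the relationship between these quantities. $\bar\psi^\dagger\psi$ is the
same as $\bar\psi^\dagger\alpha_0\psi$ and forms one of the four quantities $\bar\psi^\dagger\alpha^\mu\psi$, which should
be treated together.

Equations (18) and (16) show that $M$ is pure imaginary. Thus the
conjugate imaginary of equation (20) is
\[
\bar\psi^{*\dagger} = \bar\psi^\dagger(1 - M\alpha_m).
\]
Hence
\[
\begin{aligned}
\bar\psi^{*\dagger}\alpha^\mu\psi^* &= \bar\psi^\dagger(1 - M\alpha_m)\alpha^\mu(1 + \alpha_mM)\psi\\
&= \bar\psi^\dagger(1 - M\alpha_m)(1 + M\alpha_m)(\alpha^\mu - a_\rho{}^\mu\alpha^\rho)\psi
\end{aligned}
\]
from (19). This reduces to
\[
\begin{aligned}
\bar\psi^{*\dagger}\alpha^\mu\psi^* &= \bar\psi^\dagger(\alpha^\mu - a_\rho{}^\mu\alpha^\rho)\psi\\
&= \bar\psi^\dagger\alpha^\mu\psi + a^\mu{}_\rho\,\bar\psi^\dagger\alpha^\rho\psi
\end{aligned}
\]
with the help of (16). If we lower the suffix $\mu$ here, we get an equation
of the same form as (15), which shows that the four quantities $\bar\psi^\dagger\alpha^\mu\psi$
transform like the contravariant components of a 4-vector. Thus $\bar\psi^\dagger\psi$
transforms like the time component of a 4-vector, which is the correct
transformation law for a probability density. The space components
of the 4-vector, namely $\bar\psi^\dagger\alpha_r\psi$, if multiplied by $c$, give the probability
current, or the probability of the electron crossing unit area per
unit time.

It should be noted that $\bar\psi^\dagger\alpha_m\psi$ is invariant, since
\[
\bar\psi^{*\dagger}\alpha_m\psi^* = \bar\psi^\dagger(1 - M\alpha_m)\alpha_m(1 + \alpha_mM)\psi = \bar\psi^\dagger\alpha_m\psi.
\]
We must verify finally the conservation law, that the divergence
\[
\frac{\partial}{\partial x^\mu}(\bar\psi^\dagger\alpha^\mu\psi) \tag{22}
\]
vanishes. To prove this, multiply equation (13) by $\bar\psi^\dagger$ on the left.
The result is
\[
\bar\psi^\dagger\alpha^\mu\Bigl(i\hbar\frac{\partial}{\partial x^\mu} + \frac{e}{c}A_\mu\Bigr)\psi - \bar\psi^\dagger\alpha_m mc\,\psi = 0.
\]
The conjugate imaginary equation is
\[
\Bigl(-i\hbar\frac{\partial\bar\psi^\dagger}{\partial x^\mu} + \frac{e}{c}A_\mu\bar\psi^\dagger\Bigr)\alpha^\mu\psi - \bar\psi^\dagger\alpha_m mc\,\psi = 0.
\]
Subtracting and dividing by $i\hbar$, we get
\[
\bar\psi^\dagger\alpha^\mu\frac{\partial\psi}{\partial x^\mu} + \frac{\partial\bar\psi^\dagger}{\partial x^\mu}\alpha^\mu\psi = 0,
\]
which just expresses the vanishing of (22). In this way we complete
the proof that our theory gives consistent results in whichever frame
of reference it is applied.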


69. The motion of a free electron 

It is of interest to consider the motion of a free electron in the
Heisenberg picture according to the above theory and to study the
Heisenberg equations of motion. These equations of motion can be
integrated exactly, as was first done by Schrödinger.† For brevity
we shall omit the suffix $t$ which the notation of § 28 requires to be
inserted in dynamical variables that vary with time in the
Heisenberg picture.

† Schrödinger, Sitzungsb. d. Berlin. Akad., 1930, p. 418.

As Hamiltonian we must take the expression which we get as equal
to $cp_0$ when we put the operator on $\psi$ in (10) equal to zero, i.e.
\[
H = c\rho_1(\boldsymbol\sigma, \mathbf p) + \rho_3mc^2 = c(\boldsymbol\alpha, \mathbf p) + \rho_3mc^2. \tag{23}
\]
We see at once that the momentum commutes with $H$ and is thus a
constant of the motion. Further, the $x_1$-component of the velocity is
\[
\dot x_1 = [x_1, H] = c\alpha_1. \tag{24}
\]
This result is rather surprising, as it means an altogether different
relation between velocity and momentum from what one has in
classical mechanics. It is connected, however, with the expression
$c\bar\psi^\dagger\alpha_1\psi$ for a component of the probability current. The $\dot x_1$ given by (24)
has as eigenvalues $\pm c$, corresponding to the eigenvalues $\pm 1$ of $\alpha_1$.
As $\dot x_2$ and $\dot x_3$ are similar, we can conclude that a measurement of a
component of the velocity of a free electron is certain to lead to the result $\pm c$.
This conclusion is easily seen to hold also when there is a field present.

Since electrons are observed in practice to have velocities con- 
siderably less than that of light, it would seem that we have here a 
contradiction with experiment. The contradiction is not real, though, 
since the theoretical velocity in the above conclusion is the velocity 
at one instant of time while observed velocities are always average 
velocities through appreciable time intervals. We shall find upon 
further examination of the equations of motion that the velocity is 
not at all constant, but oscillates rapidly about a mean value which 
agrees with the observed value. 

It may easily be verified that a measurement of a component of the
velocity must lead to the result $\pm c$ in a relativistic theory, simply
from an elementary application of the principle of uncertainty of
§ 24. To measure the velocity we must measure the position at two
slightly different times and then divide the change of position by the
time interval. (It will not do to measure the momentum and apply
a formula, as the ordinary connexion between velocity and momentum
is not valid.) In order that our measured velocity may approximate
to the instantaneous velocity, the time interval between the
two measurements of position must be very short and hence these
measurements must be very accurate. The great accuracy with
which the position of the electron is known during the time-interval
must give rise, according to the principle of uncertainty, to an almost
complete indeterminacy in its momentum. This means that almost
all values of the momentum are equally probable, so that the
momentum is almost certain to be infinite. An infinite value for a component
of momentum corresponds to the value $\pm c$ for the corresponding
component of velocity.


Let us now examine how the velocity of the electron varies with
time. We have
\[
i\hbar\dot\alpha_1 = \alpha_1H - H\alpha_1.
\]
Now since $\alpha_1$ anticommutes with all the terms in $H$ except $c\alpha_1p_1$,
\[
\alpha_1H + H\alpha_1 = \alpha_1c\alpha_1p_1 + c\alpha_1p_1\alpha_1 = 2cp_1,
\]
and hence
\[
\begin{aligned}
i\hbar\dot\alpha_1 &= 2\alpha_1H - 2cp_1\\
&= -2H\alpha_1 + 2cp_1.
\end{aligned} \tag{25}
\]
Since $H$ and $p_1$ are constants, it follows from the first of equations
(25) that
\[
i\hbar\ddot\alpha_1 = 2\dot\alpha_1H. \tag{26}
\]
This differential equation in $\dot\alpha_1$ can be integrated immediately, the
result being
\[
\dot\alpha_1 = \dot\alpha_1^0e^{-2iHt/\hbar}, \tag{27}
\]
where $\dot\alpha_1^0$ is a constant, equal to the value of $\dot\alpha_1$ when $t = 0$. The
factor $e^{-2iHt/\hbar}$ must be put to the right of the factor $\dot\alpha_1^0$ in (27) on
account of the $H$ occurring to the right of the $\dot\alpha_1$ in (26). The second
of equations (25) leads in the same way to the result
\[
\dot\alpha_1 = e^{2iHt/\hbar}\dot\alpha_1^0.
\]
We can now easily complete the integration of the equation of motion
for $x_1$. From (27) and the first of equations (25)
\[
\alpha_1 = \tfrac12 i\hbar\,\dot\alpha_1^0e^{-2iHt/\hbar}H^{-1} + cp_1H^{-1}, \tag{28}
\]
and hence the time-integral of equation (24) is
\[
x_1 = -\tfrac14 c\hbar^2\dot\alpha_1^0e^{-2iHt/\hbar}H^{-2} + c^2p_1H^{-1}t + a_1, \tag{29}
\]
$a_1$ being a constant.

From (28) we see that the $x_1$ component of velocity, $c\alpha_1$, consists
of two parts, a constant part $c^2p_1H^{-1}$, connected with the momentum
by the classical relativistic formula, and an oscillatory part
\[
\tfrac12 ic\hbar\,\dot\alpha_1^0e^{-2iHt/\hbar}H^{-1},
\]
whose frequency is high, being $2H/h$, which is at least $2mc^2/h$. Only
the constant part would be observed in a practical measurement of
velocity, such a measurement giving the average velocity through a
time-interval much larger than $h/2mc^2$. The oscillatory part secures
that the instantaneous value of $\dot x_1$ shall have the eigenvalues $\pm c$. The
oscillatory part of $x_1$ is small, being, according to (29),
\[
-\tfrac14 c\hbar^2\dot\alpha_1^0e^{-2iHt/\hbar}H^{-2} = \tfrac12 ic\hbar(\alpha_1 - cp_1H^{-1})H^{-1},
\]
which is of the order of magnitude $\hbar/mc$, since $(\alpha_1 - cp_1H^{-1})$ is of the
order of magnitude unity.
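
The integrated form (28) can be compared with a direct evaluation of the Heisenberg variable for a fixed momentum eigenvalue. This is only a sketch under stated assumptions: $H = c(\boldsymbol\alpha, \mathbf p) + \rho_3mc^2$ is treated as a 4×4 Hermitian matrix, the default units take $\hbar = c = m = 1$, the $\alpha$'s are in the representation $\alpha_r = \rho_1\sigma_r$, and the Heisenberg variable is taken as $e^{iHt/\hbar}\alpha_1e^{-iHt/\hbar}$. The function names are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

ALPHA = [np.kron(SX, s) for s in (SX, SY, SZ)]  # alpha_r = rho_1 sigma_r
RHO3 = np.kron(SZ, I2)


def alpha1_two_ways(p, t, m=1.0, c=1.0, hbar=1.0):
    """Return alpha_1(t) computed directly and from formula (28)."""
    H = c * sum(p[k] * ALPHA[k] for k in range(3)) + RHO3 * m * c**2
    E, V = np.linalg.eigh(H)  # H is Hermitian with eigenvalues +/- sqrt(p^2c^2 + m^2c^4)

    def exp_iH(s):
        """exp(i H s / hbar) from the eigen-decomposition of H."""
        return (V * np.exp(1j * E * s / hbar)) @ V.conj().T

    direct = exp_iH(t) @ ALPHA[0] @ exp_iH(-t)

    H_inv = np.linalg.inv(H)
    alpha_dot0 = (ALPHA[0] @ H - H @ ALPHA[0]) / (1j * hbar)  # value at t = 0
    from_28 = 0.5j * hbar * alpha_dot0 @ exp_iH(-2 * t) @ H_inv + c * p[0] * H_inv
    return direct, from_28


direct, from_28 = alpha1_two_ways(p=(0.3, -0.2, 0.5), t=1.7)
print(np.allclose(direct, from_28))
```

The eigenvalues of `direct` remain $\pm 1$ at every time, which corresponds to the statement that the instantaneous velocity component has the eigenvalues $\pm c$.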


70. Existence of the spin 

In § 67 we saw that the correct wave equation for the electron in 
the absence of an electromagnetic field, namely equation (7) or (10), is 
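equivalent to the wave equation (6) which is suggested from analogy
with the classical theory. This equivalence no longer holds when
there is a field. The wave equation to be expected from analogy with
the classical theory in this case is
$$\left\{\left(p_0 + \frac{e}{c}A_0\right)^2 - \left(\mathbf{p} + \frac{e}{c}\mathbf{A}\right)^2 - m^2c^2\right\}\psi = 0, \tag{30}$$
in which the operator is just the classical relativistic Hamiltonian. If
we multiply (11) by some factor on the left to make it resemble
(30) as closely as possible, namely the factor
$$p_0 + \frac{e}{c}A_0 + \rho_1\left(\boldsymbol{\sigma},\, \mathbf{p} + \frac{e}{c}\mathbf{A}\right) + \rho_3 mc,$$
we get
$$\left\{\left(p_0 + \frac{e}{c}A_0\right)^2 - \left(\boldsymbol{\sigma},\, \mathbf{p} + \frac{e}{c}\mathbf{A}\right)^2 - m^2c^2 - \rho_1\left[\left(p_0 + \frac{e}{c}A_0\right)\left(\boldsymbol{\sigma},\, \mathbf{p} + \frac{e}{c}\mathbf{A}\right) - \left(\boldsymbol{\sigma},\, \mathbf{p} + \frac{e}{c}\mathbf{A}\right)\left(p_0 + \frac{e}{c}A_0\right)\right]\right\}\psi = 0. \tag{31}$$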


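We now use the general formula that, if $\mathbf{B}$ and $\mathbf{C}$ are any two
three-dimensional vectors that commute with $\boldsymbol{\sigma}$,
$$(\boldsymbol{\sigma}, \mathbf{B})(\boldsymbol{\sigma}, \mathbf{C}) = \sum_{123} \{\sigma_1^2 B_1 C_1 + \sigma_1\sigma_2 B_1 C_2 + \sigma_2\sigma_1 B_2 C_1\},$$
the summation referring to cyclic permutations of the suffixes 1, 2, 3,
or
$$\begin{aligned} (\boldsymbol{\sigma}, \mathbf{B})(\boldsymbol{\sigma}, \mathbf{C}) &= (\mathbf{B}, \mathbf{C}) + i\sum_{123} \sigma_3 (B_1 C_2 - B_2 C_1) \\ &= (\mathbf{B}, \mathbf{C}) + i(\boldsymbol{\sigma}, \mathbf{B} \times \mathbf{C}). \end{aligned} \tag{32}$$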
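
Formula (32) is easy to test numerically when $\mathbf{B}$ and $\mathbf{C}$ are ordinary numerical vectors. The sketch below assumes the usual Pauli matrices for $\boldsymbol{\sigma}$ (the helper names and sample vectors are illustrative). It does not cover the operator case used in the text, where the components of $\mathbf{B}$ need not commute with one another and $\mathbf{B} \times \mathbf{B}$ need not vanish.

```python
# Numerical check of (sigma,B)(sigma,C) = (B,C) + i (sigma, B x C)
# for number-valued vectors B and C, using the usual Pauli matrices.

SIGMA = (
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
)

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sigma_dot(v):
    """The 2x2 matrix (sigma, v) = sigma_1 v_1 + sigma_2 v_2 + sigma_3 v_3."""
    return [[sum(v[r] * SIGMA[r][i][j] for r in range(3)) for j in range(2)]
            for i in range(2)]

def dot(b, c):
    return sum(x * y for x, y in zip(b, c))

def cross(b, c):
    return (b[1] * c[2] - b[2] * c[1],
            b[2] * c[0] - b[0] * c[2],
            b[0] * c[1] - b[1] * c[0])

def identity_residual(b, c):
    """Largest element of the difference between the two sides of (32)."""
    lhs = matmul(sigma_dot(b), sigma_dot(c))
    sc = sigma_dot(cross(b, c))
    rhs = [[dot(b, c) * (1 if i == j else 0) + 1j * sc[i][j] for j in range(2)]
           for i in range(2)]
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))

residual = identity_residual((1.0, -2.0, 0.5), (0.3, 4.0, -1.5))
print("residual of formula (32):", residual)
```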
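Taking $\mathbf{B} = \mathbf{C} = \mathbf{p} + e/c\,.\,\mathbf{A}$, we find, since
$$\left(\mathbf{p} + \frac{e}{c}\mathbf{A}\right) \times \left(\mathbf{p} + \frac{e}{c}\mathbf{A}\right) = \frac{e}{c}\{\mathbf{p} \times \mathbf{A} + \mathbf{A} \times \mathbf{p}\} = -i\hbar\frac{e}{c}\,\mathrm{curl}\,\mathbf{A} = -i\hbar\frac{e}{c}\,\boldsymbol{\mathcal{H}},$$
where $\boldsymbol{\mathcal{H}}$ is the magnetic field, that
$$\left(\boldsymbol{\sigma},\, \mathbf{p} + \frac{e}{c}\mathbf{A}\right)^2 = \left(\mathbf{p} + \frac{e}{c}\mathbf{A}\right)^2 + \frac{\hbar e}{c}(\boldsymbol{\sigma}, \boldsymbol{\mathcal{H}}). \tag{33}$$
Also we have
$$\begin{aligned} \left(p_0 + \frac{e}{c}A_0\right)&\left(\boldsymbol{\sigma},\, \mathbf{p} + \frac{e}{c}\mathbf{A}\right) - \left(\boldsymbol{\sigma},\, \mathbf{p} + \frac{e}{c}\mathbf{A}\right)\left(p_0 + \frac{e}{c}A_0\right) \\ &= \frac{e}{c}(\boldsymbol{\sigma},\, p_0\mathbf{A} - \mathbf{A}p_0 + A_0\mathbf{p} - \mathbf{p}A_0) \\ &= i\hbar\frac{e}{c}\left(\boldsymbol{\sigma},\, \frac{1}{c}\frac{\partial \mathbf{A}}{\partial t} + \mathrm{grad}\,A_0\right) = -i\hbar\frac{e}{c}(\boldsymbol{\sigma}, \boldsymbol{\mathcal{E}}), \end{aligned}$$
where $\boldsymbol{\mathcal{E}}$ is the electric field. Thus (31) becomes
$$\left\{\left(p_0 + \frac{e}{c}A_0\right)^2 - \left(\mathbf{p} + \frac{e}{c}\mathbf{A}\right)^2 - m^2c^2 - \frac{\hbar e}{c}(\boldsymbol{\sigma}, \boldsymbol{\mathcal{H}}) + i\rho_1\frac{\hbar e}{c}(\boldsymbol{\sigma}, \boldsymbol{\mathcal{E}})\right\}\psi = 0. \tag{34}$$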
This equation differs from (30) through having two extra terms in 
the operator. These extra terms involve some new physical effects, 
but since they are not real they do not lend themselves very directly 
to physical interpretation. 

To get an understanding of the physical features involved in the 
difference between (34) and (30) it is better to work with the Heisen- 
berg picture, this picture being always the more suitable one for 
comparisons between classical and quantum mechanics. The Heisen- 
berg equations of motion are determined by the Hamiltonian 


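$$H = -eA_0 + c\rho_1\left(\boldsymbol{\sigma},\, \mathbf{p} + \frac{e}{c}\mathbf{A}\right) + \rho_3 mc^2, \tag{35}$$
the generalization of (23) to the case when there is a field. Equation
(35) gives
$$\begin{aligned} \left(\frac{H}{c} + \frac{e}{c}A_0\right)^2 &= \left\{\rho_1\left(\boldsymbol{\sigma},\, \mathbf{p} + \frac{e}{c}\mathbf{A}\right) + \rho_3 mc\right\}^2 \\ &= \left(\boldsymbol{\sigma},\, \mathbf{p} + \frac{e}{c}\mathbf{A}\right)^2 + m^2c^2 \\ &= \left(\mathbf{p} + \frac{e}{c}\mathbf{A}\right)^2 + m^2c^2 + \frac{\hbar e}{c}(\boldsymbol{\sigma}, \boldsymbol{\mathcal{H}}), \end{aligned} \tag{36}$$
with the help of (33). We have here the real part of the extra terms
in (34) appearing without the pure imaginary part. For an electron
moving slowly (i.e. with small momentum), we may expect the
Heisenberg equations of motion to be determined by a Hamiltonian
of the form $mc^2 + H_1$, where $H_1$ is small compared with $mc^2$. Putting
$mc^2 + H_1$ for $H$ in (36) and neglecting $H_1^2$ and other terms involving
$c^{-2}$, we get, on dividing by $2m$,
$$H_1 + eA_0 = \frac{1}{2m}\left(\mathbf{p} + \frac{e}{c}\mathbf{A}\right)^2 + \frac{\hbar e}{2mc}(\boldsymbol{\sigma}, \boldsymbol{\mathcal{H}}). \tag{37}$$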


The Hamiltonian $H_1$ given by (37) is the same as the classical
Hamiltonian for a slow electron, except for the last term
$$\frac{\hbar e}{2mc}(\boldsymbol{\sigma}, \boldsymbol{\mathcal{H}}).$$
This term may be considered as an additional potential energy
which a slow electron has in the quantum theory and may be
interpreted as arising from the electron having a magnetic moment
$-\hbar e/2mc\,.\,\boldsymbol{\sigma}$. This magnetic moment is the one assumed in §§ 41 and
47 for dealing with the Zeeman effect and is in agreement with
experiment.
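
As a rough numerical illustration of the size of this term, the sketch below evaluates $\hbar e/2mc$ in Gaussian units and the corresponding energy in a given magnetic field. The constants, the function names and the sample field of $10^4$ gauss are assumptions introduced for illustration.

```python
# Magnitude of the magnetic moment hbar e / 2 m c in Gaussian (c.g.s.) units.
# Constant values are assumed here; they are not quoted in the text.
E_ESU = 4.80320471e-10    # electronic charge e, e.s.u.
HBAR = 1.054571817e-27    # hbar, erg s
M_E = 9.1093837015e-28    # electron mass, g
C = 2.99792458e10         # velocity of light, cm / s

def magnetic_moment_magnitude():
    """hbar e / 2 m c in erg per gauss."""
    return E_ESU * HBAR / (2 * M_E * C)

def extra_energy(field_gauss, sigma_component=1):
    """Value of (hbar e / 2mc)(sigma, H) for sigma = +1 or -1 along the field, in erg."""
    return magnetic_moment_magnitude() * field_gauss * sigma_component

mu = magnetic_moment_magnitude()
shift = extra_energy(1.0e4)
print(f"hbar e / 2mc = {mu:.4e} erg/gauss; energy in 10^4 gauss = {shift:.3e} erg")
```

The two spin orientations along the field thus differ in energy by twice this amount.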

The spin angular momentum does not give rise to any potential
energy and therefore does not appear in the result of the preceding
calculation. The simplest way of showing the existence of the spin
angular momentum is to take the case of the motion of a free electron
or an electron in a central field of force and determine the angular
momentum integrals. This means working with the Hamiltonian (23),
or with the Hamiltonian (35) with $\mathbf{A} = 0$ and $A_0$ a function of the
radius $r$, i.e.
$$H = -eA_0(r) + c\rho_1(\boldsymbol{\sigma}, \mathbf{p}) + \rho_3 mc^2, \tag{38}$$
and obtaining the Heisenberg equations of motion for the angular
momentum. With either Hamiltonian we find for the rate of change
of the $x_1$-component of orbital angular momentum, $m_1 = x_2 p_3 - x_3 p_2$,
with the help of commutation relations proved in § 35,
$$\begin{aligned} i\hbar\dot m_1 &= m_1 H - H m_1 \\ &= c\rho_1\{m_1(\boldsymbol{\sigma}, \mathbf{p}) - (\boldsymbol{\sigma}, \mathbf{p})m_1\} \\ &= c\rho_1(\boldsymbol{\sigma},\, m_1\mathbf{p} - \mathbf{p}m_1) \\ &= i\hbar c\rho_1\{\sigma_2 p_3 - \sigma_3 p_2\}. \end{aligned}$$
Thus $\dot m_1 \neq 0$ and the orbital angular momentum is not a constant
of the motion. This result is to be expected from the integrated
equation of motion (29), the oscillatory part of the motion here
displayed giving rise to an oscillatory term in the angular momentum.
We have further
$$\begin{aligned} i\hbar\dot\sigma_1 &= \sigma_1 H - H\sigma_1 \\ &= c\rho_1\{\sigma_1(\boldsymbol{\sigma}, \mathbf{p}) - (\boldsymbol{\sigma}, \mathbf{p})\sigma_1\} \\ &= c\rho_1(\sigma_1\boldsymbol{\sigma} - \boldsymbol{\sigma}\sigma_1,\, \mathbf{p}) \\ &= 2ic\rho_1\{\sigma_3 p_2 - \sigma_2 p_3\} \end{aligned}$$
with the help of equations (51) of § 37. Hence
$$\dot m_1 + \tfrac{1}{2}\hbar\dot\sigma_1 = 0,$$
so that the vector $\mathbf{m} + \tfrac{1}{2}\hbar\boldsymbol{\sigma}$ is a constant of the motion. This result
one can interpret by saying the electron has a spin angular momentum
$\tfrac{1}{2}\hbar\boldsymbol{\sigma}$, which must be added to the orbital angular momentum $\mathbf{m}$ before
one gets a constant of the motion. The spin angular momentum
could alternatively be obtained from the rotation operators for states
of spin in accordance with the general method of § 35.

The same vector $\boldsymbol{\sigma}$ fixes the directions of both the spin magnetic
moment and the spin angular momentum. If an electron in a certain
state of spin has a spin angular momentum of $\tfrac{1}{2}\hbar$ in a particular
direction, it will have a magnetic moment $-e\hbar/2mc$ in the same
direction.

We were led to the value $\tfrac{1}{2}\hbar$ for the spin of the electron by an
argument depending simply on general principles of quantum theory
and relativity. One could apply the same argument to other kinds
of elementary particle and one would be led to the same conclusion,
that the spin angular momentum is half a quantum. This would be
satisfactory for the proton and the neutron, but there are some kinds
of elementary particle (e.g. the photon and certain kinds of meson)
whose spins are known experimentally to be different from $\tfrac{1}{2}\hbar$, so we
have a discrepancy between our theory and experiment.

The answer is to be found in a hidden assumption in our work.
Our argument is valid only provided the position of the particle is
an observable. If this assumption holds, the particle must have a
spin angular momentum of half a quantum. For those particles that
have a different spin the assumption must be false and any dynamical
variables $x_1$, $x_2$, $x_3$ that may be introduced to describe the position
of the particle cannot be observables in accordance with our general
theory. For such particles there is no true Schrödinger representation.
One might be able to introduce a quasi wave function involving the
dynamical variables $x_1$, $x_2$, $x_3$, but it would not have the correct
physical interpretation of a wave function, namely that the square of
its modulus gives the probability density. For such particles there is still
a momentum representation, which is sufficient for practical purposes.


71. Transition to polar variables 

For the further study of the motion of an electron in a central field 
of force with the Hamiltonian (38), it is convenient to make a 
transformation to polar coordinates, as was done in § 38 in the 
non-relativistic case. We can introduce $r$ and $p_r$ as before, but
instead of $k$, the magnitude of the orbital angular momentum $\mathbf{m}$,
which is no longer a constant of the motion, we must now use the
magnitude of the total angular momentum $\mathbf{M} = \mathbf{m} + \tfrac{1}{2}\hbar\boldsymbol{\sigma}$. Let us put
$$j^2\hbar^2 = M_1^2 + M_2^2 + M_3^2 + \tfrac{1}{4}\hbar^2. \tag{39}$$
The eigenvalues of $m_3$ are integral multiples of $\hbar$, those of $\tfrac{1}{2}\hbar\sigma_3$ are
$\pm\tfrac{1}{2}\hbar$, and hence those of $M_3$ must be half odd integral multiples of
$\hbar$. It follows from the theory of § 36 that the eigenvalues of $|j|$ must
be integers greater than zero.
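
The arithmetic behind this statement can be spelled out in a few lines. The sketch below assumes the standard result that $M_1^2 + M_2^2 + M_3^2$ has eigenvalues $J(J+1)\hbar^2$, with $J$ here a positive half odd integer (the symbol $J$ and the function name are introduced only for this illustration), and uses exact rational arithmetic to show that (39) then gives $|j| = J + \tfrac{1}{2}$.

```python
from fractions import Fraction

def j_magnitude(total_angular_momentum):
    """Return |j| from equation (39), in units of hbar, for a half odd integer J.

    Assumes M^2 has the eigenvalue J(J+1) hbar^2.
    """
    big_j = Fraction(total_angular_momentum)
    if big_j <= 0 or (2 * big_j) % 2 != 1:
        raise ValueError("J must be a positive half odd integer")
    j_squared = big_j * (big_j + 1) + Fraction(1, 4)   # right-hand side of (39)
    root = big_j + Fraction(1, 2)
    assert root * root == j_squared                    # (J + 1/2)^2 = J(J+1) + 1/4
    return root

magnitudes = [j_magnitude(Fraction(n, 2)) for n in (1, 3, 5)]
print(magnitudes)
```

For $J = \tfrac{1}{2}, \tfrac{3}{2}, \tfrac{5}{2}$ this yields $|j| = 1, 2, 3$.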

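If in formula (32) we take $\mathbf{B} = \mathbf{C} = \mathbf{m}$, we get
$$\begin{aligned} (\boldsymbol{\sigma}, \mathbf{m})^2 &= \mathbf{m}^2 + i(\boldsymbol{\sigma}, \mathbf{m} \times \mathbf{m}) \\ &= \mathbf{m}^2 - \hbar(\boldsymbol{\sigma}, \mathbf{m}) \\ &= (\mathbf{m} + \tfrac{1}{2}\hbar\boldsymbol{\sigma})^2 - 2\hbar(\boldsymbol{\sigma}, \mathbf{m}) - \tfrac{3}{4}\hbar^2. \end{aligned}$$
Hence
$$\{(\boldsymbol{\sigma}, \mathbf{m}) + \hbar\}^2 = \mathbf{M}^2 + \tfrac{1}{4}\hbar^2.$$
Thus $(\boldsymbol{\sigma}, \mathbf{m}) + \hbar$ is a quantity whose square is $\mathbf{M}^2 + \tfrac{1}{4}\hbar^2$ and we could,
consistently with equation (39), define $j\hbar$ as $(\boldsymbol{\sigma}, \mathbf{m}) + \hbar$. This would
not be the most convenient definition for $j$, however, since we would
like to have $j$ a constant of the motion and $(\boldsymbol{\sigma}, \mathbf{m}) + \hbar$ is not constant.
We have, in fact, from applications of (32),
$$(\boldsymbol{\sigma}, \mathbf{m})(\boldsymbol{\sigma}, \mathbf{p}) = i(\boldsymbol{\sigma}, \mathbf{m} \times \mathbf{p})$$
and
$$(\boldsymbol{\sigma}, \mathbf{p})(\boldsymbol{\sigma}, \mathbf{m}) = i(\boldsymbol{\sigma}, \mathbf{p} \times \mathbf{m}),$$
so that
$$\begin{aligned} (\boldsymbol{\sigma}, \mathbf{m})(\boldsymbol{\sigma}, \mathbf{p}) + (\boldsymbol{\sigma}, \mathbf{p})(\boldsymbol{\sigma}, \mathbf{m}) &= i\sum_{123} \sigma_1\{m_2 p_3 - m_3 p_2 + p_2 m_3 - p_3 m_2\} \\ &= i\sum_{123} \sigma_1\,.\,2i\hbar p_1 = -2\hbar(\boldsymbol{\sigma}, \mathbf{p}), \end{aligned}$$
or
$$\{(\boldsymbol{\sigma}, \mathbf{m}) + \hbar\}(\boldsymbol{\sigma}, \mathbf{p}) + (\boldsymbol{\sigma}, \mathbf{p})\{(\boldsymbol{\sigma}, \mathbf{m}) + \hbar\} = 0.$$
Thus $(\boldsymbol{\sigma}, \mathbf{m}) + \hbar$ anticommutes with one of the terms in the expression
(38) for $H$, namely the term $c\rho_1(\boldsymbol{\sigma}, \mathbf{p})$, and commutes with the other
two. It follows that $\rho_3\{(\boldsymbol{\sigma}, \mathbf{m}) + \hbar\}$ commutes with all the three terms
in $H$ and is a constant of the motion. But the square of $\rho_3\{(\boldsymbol{\sigma}, \mathbf{m}) + \hbar\}$
is also $\mathbf{M}^2 + \tfrac{1}{4}\hbar^2$. We can therefore take
$$j\hbar = \rho_3\{(\boldsymbol{\sigma}, \mathbf{m}) + \hbar\}, \tag{40}$$
which gives us a convenient rational definition for $j$ which is
consistent with (39) and makes $j$ a constant of the motion. The eigenvalues
of this $j$ are all positive and negative integers, excluding zero.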


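By a further application of (32), we get
$$\begin{aligned} (\boldsymbol{\sigma}, \mathbf{x})(\boldsymbol{\sigma}, \mathbf{p}) &= (\mathbf{x}, \mathbf{p}) + i(\boldsymbol{\sigma}, \mathbf{m}) \\ &= rp_r + i\rho_3 j\hbar - i\hbar, \end{aligned} \tag{41}$$
with the help of (40) and also of equation (58) of § 38. We introduce
the linear operator $\epsilon$ defined by
$$r\epsilon = \rho_1(\boldsymbol{\sigma}, \mathbf{x}). \tag{42}$$
Since $r$ commutes with $\rho_1$ and with $(\boldsymbol{\sigma}, \mathbf{x})$, it must commute with $\epsilon$.
We thus have
$$r^2\epsilon^2 = [\rho_1(\boldsymbol{\sigma}, \mathbf{x})]^2 = (\boldsymbol{\sigma}, \mathbf{x})^2 = \mathbf{x}^2 = r^2,$$
or
$$\epsilon^2 = 1.$$
Now $\rho_1(\boldsymbol{\sigma}, \mathbf{p})$ commutes with $j$, and since there is symmetry between
$\mathbf{x}$ and $\mathbf{p}$ so far as angular momentum is concerned, $\rho_1(\boldsymbol{\sigma}, \mathbf{x})$ must also
commute with $j$. Hence $\epsilon$ commutes with $j$. Further, $\epsilon$ must commute
with $p_r$, since we have
$$(\boldsymbol{\sigma}, \mathbf{x})(\mathbf{x}, \mathbf{p}) - (\mathbf{x}, \mathbf{p})(\boldsymbol{\sigma}, \mathbf{x}) = (\boldsymbol{\sigma},\, \mathbf{x}(\mathbf{x}, \mathbf{p}) - (\mathbf{x}, \mathbf{p})\mathbf{x}) = i\hbar(\boldsymbol{\sigma}, \mathbf{x}),$$
which gives
$$r\epsilon\, rp_r - rp_r\, r\epsilon = i\hbar r\epsilon,$$
or
$$r^2\epsilon p_r - r^2 p_r\epsilon = 0.$$
From (41) and (42) we obtain
$$r\epsilon\rho_1(\boldsymbol{\sigma}, \mathbf{p}) = rp_r + i\rho_3 j\hbar - i\hbar,$$
or
$$\rho_1(\boldsymbol{\sigma}, \mathbf{p}) = \epsilon(p_r - i\hbar/r) + i\epsilon\rho_3 j\hbar/r.$$
Thus (38) becomes
$$H/c = -e/c\,.\,A_0 + \epsilon(p_r - i\hbar/r) + i\epsilon\rho_3 j\hbar/r + \rho_3 mc.$$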


This gives our Hamiltonian expressed in terms of polar variables. It
should be noticed that $\epsilon$ and $\rho_3$ commute with all the other variables
occurring in $H$ and anticommute with one another. This means that
we can take a representation with $\rho_3$ diagonal in which $\epsilon$ and $\rho_3$ are
represented respectively by the matrices
$$\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{43}$$
If $r$ is also diagonal in the representation, the representative
$\langle r'\rho_3'|\rangle$ of a ket will have two components, $\langle r', 1|\rangle = \psi_a(r')$ and
$\langle r', -1|\rangle = \psi_b(r')$ say, referring to the two rows and columns of the
matrices (43).
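
The algebraic requirements on the pair of matrices (43) can be verified directly. The sketch below takes $\epsilon$ to be the matrix with rows $(0, -i)$, $(i, 0)$ and $\rho_3$ to be the diagonal matrix with elements $1, -1$ (this explicit form is an assumption of the sketch, chosen so that $\rho_3$ is diagonal) and checks that each squares to unity and that they anticommute. It also shows that $i\epsilon\rho_3$ is then real, which makes the resulting radial equations real.

```python
# Assumed explicit 2x2 matrices for epsilon and rho_3, with rho_3 diagonal.
EPS = [[0, -1j], [1j, 0]]
RHO3 = [[1, 0], [0, -1]]
IDENTITY = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

eps_squared = matmul(EPS, EPS)
rho3_squared = matmul(RHO3, RHO3)
anticommutator = add(matmul(EPS, RHO3), matmul(RHO3, EPS))
# i * epsilon * rho_3, the matrix multiplying j hbar / r in the Hamiltonian.
i_eps_rho3 = [[1j * x for x in row] for row in matmul(EPS, RHO3)]

print(eps_squared, rho3_squared, anticommutator, i_eps_rho3)
```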


72. The fine-structure of the energy-levels of hydrogen 

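We shall now take the case of the hydrogen atom, for which $A_0 = e/r$,
and work out its energy-levels, given by the eigenvalues $H'$ of $H$.
The equation $(H' - H)|\rangle = 0$ which defines these eigenvalues, when
written in terms of representatives in the representation discussed
above with $\epsilon$ and $\rho_3$ represented by the matrices (43), gives the
equations
$$\begin{aligned} \left(\frac{H'}{c} + \frac{e^2}{cr}\right)\psi_a + \hbar\frac{\partial\psi_b}{\partial r} + \frac{(j+1)\hbar}{r}\psi_b - mc\,\psi_a &= 0, \\ \left(\frac{H'}{c} + \frac{e^2}{cr}\right)\psi_b - \hbar\frac{\partial\psi_a}{\partial r} + \frac{(j-1)\hbar}{r}\psi_a + mc\,\psi_b &= 0. \end{aligned}$$
Putting
$$\frac{\hbar}{mc - H'/c} = a_1, \qquad \frac{\hbar}{mc + H'/c} = a_2, \tag{44}$$
these equations reduce to
$$\begin{aligned} \left(\frac{1}{a_1} - \frac{\alpha}{r}\right)\psi_a - \left(\frac{\partial}{\partial r} + \frac{j+1}{r}\right)\psi_b &= 0, \\ \left(\frac{1}{a_2} + \frac{\alpha}{r}\right)\psi_b - \left(\frac{\partial}{\partial r} - \frac{j-1}{r}\right)\psi_a &= 0, \end{aligned} \tag{45}$$
where $\alpha = e^2/\hbar c$, which is a small number. We shall solve these
equations by a similar method to that used for equation (73) in § 39.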


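Put
$$\psi_a = r^{-1}e^{-r/a}f, \qquad \psi_b = r^{-1}e^{-r/a}g, \tag{46}$$
introducing two new functions, $f$ and $g$, of $r$, where
$$a = (a_1 a_2)^{1/2} = \hbar(m^2c^2 - H'^2/c^2)^{-1/2}. \tag{47}$$
Equations (45) become
$$\begin{aligned} \left(\frac{1}{a_1} - \frac{\alpha}{r}\right)f - \left(\frac{\partial}{\partial r} - \frac{1}{a} + \frac{j}{r}\right)g &= 0, \\ \left(\frac{1}{a_2} + \frac{\alpha}{r}\right)g - \left(\frac{\partial}{\partial r} - \frac{1}{a} - \frac{j}{r}\right)f &= 0. \end{aligned} \tag{48}$$
We now try for a solution in which $f$ and $g$ are in the form of power
series,
$$f = \sum_s c_s r^s, \qquad g = \sum_s c'_s r^s, \tag{49}$$
in which consecutive values of $s$ differ by unity though these values
need not be integers. Substituting these expressions for $f$ and $g$ in
(48) and picking out coefficients of $r^{s-1}$, we obtain
$$\begin{aligned} c_{s-1}/a_1 - \alpha c_s - (s+j)c'_s + c'_{s-1}/a &= 0, \\ c'_{s-1}/a_2 + \alpha c'_s - (s-j)c_s + c_{s-1}/a &= 0. \end{aligned} \tag{50}$$
By multiplying the first of these equations by $a$ and the second by
$a_2$ and subtracting, we eliminate both $c_{s-1}$ and $c'_{s-1}$, since from
(47) $a/a_1 = a_2/a$. We are left with
$$[a\alpha - a_2(s-j)]c_s + [a_2\alpha + a(s+j)]c'_s = 0, \tag{51}$$
a relation which shows the connexion between the primed and
unprimed $c$'s.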

The boundary condition at $r = 0$ requires that $r\psi_a$ and $r\psi_b \to 0$ as
$r \to 0$, so from (46) $f$ and $g \to 0$ as $r \to 0$. Thus the series (49) must
terminate on the side of small $s$. If $s_0$ is the minimum value of $s$ for
which $c_s$ and $c'_s$ do not both vanish, we obtain from (50), by putting
$s = s_0$ and $c_{s_0-1} = c'_{s_0-1} = 0$,
$$\begin{aligned} \alpha c_{s_0} + (s_0 + j)c'_{s_0} &= 0, \\ \alpha c'_{s_0} - (s_0 - j)c_{s_0} &= 0, \end{aligned} \tag{52}$$
which give
$$\alpha^2 = -s_0^2 + j^2.$$
Since the boundary condition requires that the minimum value of $s$
shall be greater than zero, we must take
$$s_0 = +\sqrt{j^2 - \alpha^2}.$$
To investigate the convergence of the series (49) we shall determine
the ratio $c_s/c_{s-1}$ for large $s$. Equation (51) and the second of equations
(50) give approximately, when $s$ is large,
$$a_2 c_s = a c'_s$$
and
$$sc_s = c_{s-1}/a + c'_{s-1}/a_2.$$
Hence
$$c_s/c_{s-1} = 2/as.$$
The series (49) will therefore converge like
$$\sum_s \frac{1}{s!}\left(\frac{2r}{a}\right)^s,$$
or $e^{2r/a}$. This result is similar to that obtained in § 39 and allows us
to infer, as in § 39, that all values of $H'$ are permissible for which $a$
is pure imaginary, i.e. from (47), for which $H' > mc^2$, while for
$H' < mc^2$ we take $a$ to be positive and then find that only those
values of $H'$ are permissible for which the series (49) terminate on
the side of large $s$.


If the series (49) terminate with the terms $c_s$ and $c'_s$, so that
$c_{s+1} = c'_{s+1} = 0$, we obtain from (50) with $s+1$ substituted for $s$
$$\begin{aligned} c_s/a_1 + c'_s/a &= 0, \\ c'_s/a_2 + c_s/a &= 0. \end{aligned} \tag{53}$$
These two equations are equivalent on account of (47). When
combined with (51), they give
$$a_1[a\alpha - a_2(s-j)] = a[a_2\alpha + a(s+j)],$$
which reduces to
$$2a_1 a_2 s = a(a_1 - a_2)\alpha,$$
or
$$\frac{s}{a} = \frac{1}{2}\left(\frac{1}{a_2} - \frac{1}{a_1}\right)\alpha = \frac{H'}{c\hbar}\,\alpha,$$
with the help of (44). Squaring and using (47), we obtain
$$s^2(m^2c^2 - H'^2/c^2) = \alpha^2 H'^2/c^2.$$
Hence
$$\frac{H'}{mc^2} = \left(1 + \frac{\alpha^2}{s^2}\right)^{-1/2}.$$
The $s$ here, which specifies the last term in the series, must be greater
than $s_0$ by some integer not less than zero. Calling this integer $n$,
we have
$$s = n + \sqrt{j^2 - \alpha^2}$$
and thus
$$\frac{H'}{mc^2} = \left\{1 + \frac{\alpha^2}{\{n + \sqrt{j^2 - \alpha^2}\}^2}\right\}^{-1/2}. \tag{54}$$
This formula gives the discrete energy-levels of the hydrogen
spectrum and was first obtained by Sommerfeld working with Bohr's
orbit theory. There are two quantum numbers $n$ and $j$ involved, but
owing to $\alpha^2$ being very small the energy depends almost entirely on
$n + |j|$. Values of $n$ and $|j|$ that give the same $n + |j|$ give rise to a
set of energy-levels lying very close to one another, and to the
energy-level given by the non-relativistic formula (80) of § 39 with
$s = n + |j|$, apart from the constant term $mc^2$.

We used equations (53) by combining them with (51), but this does
not make full use of (53) since the coefficients of $c_s$ and $c'_s$ in (51) may
both vanish. In this case we get, multiplying the first coefficient by
$a_1$ and the second by $a$ and adding,
$$a(a_1 + a_2)\alpha + 2a_1 a_2 j = 0.$$
Thus $j$ must be negative in this case. With the help of (44) and (47)
we get further
$$-\frac{2j}{\alpha} = \frac{a}{a_1} + \frac{a}{a_2} = \frac{2mca}{\hbar} = \frac{2mc}{(m^2c^2 - H'^2/c^2)^{1/2}},$$
$$\frac{H'^2}{m^2c^4} = 1 - \frac{\alpha^2}{j^2}.$$
Since $H'$ must be positive, this leads to
$$\frac{H'}{mc^2} = \frac{\sqrt{j^2 - \alpha^2}}{|j|}, \tag{55}$$
which is the value of $H'$ given by (54) when $n = 0$. The case $n = 0$
with $j$ negative thus needs further investigation to see whether the
conditions (53) are then fulfilled.

With $n = 0$, the maximum value of $s$ is the same as the minimum,
so equations (53) with $s_0$ substituted for $s$ should agree with (52).
Now (55) gives, from (44) and (47),
$$\frac{1}{a_1} = \frac{mc}{\hbar}\left(1 - \frac{\sqrt{j^2 - \alpha^2}}{|j|}\right), \qquad \frac{1}{a} = \frac{mc}{\hbar}\frac{\alpha}{|j|},$$
so the first of equations (53) with $s_0$ substituted for $s$ gives
$$c_{s_0}\{|j| - \sqrt{j^2 - \alpha^2}\} + c'_{s_0}\alpha = 0.$$
This agrees with the second of equations (52) only if $j$ is positive.
We can conclude that, for $n = 0$, $j$ must be a positive integer, while
for the other values of $n$ all non-zero integral values of $j$ are allowed.
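
The energy-level formula (54), together with the restriction on $j$ for $n = 0$, can be evaluated directly. The following is a minimal sketch in which the numerical values of $\alpha$ and of $mc^2$ in electron-volts are assumed (they are not given in the text), and the function names are illustrative. It prints the binding energy $mc^2 - H'$ of the lowest level and the small separation between the two levels with $n + |j| = 2$.

```python
import math

ALPHA = 1 / 137.035999    # fine-structure constant e^2 / hbar c (assumed value)
MC2_EV = 510998.95        # electron rest-energy m c^2 in eV (assumed value)

def level_ratio(n, j, alpha=ALPHA):
    """H' / m c^2 from formula (54) for quantum numbers n >= 0 and j != 0."""
    if n < 0 or j == 0:
        raise ValueError("need n >= 0 and a non-zero integer j")
    if n == 0 and j < 0:
        raise ValueError("for n = 0 only positive j is allowed")
    s = n + math.sqrt(j * j - alpha * alpha)
    return (1 + alpha * alpha / (s * s)) ** -0.5

def binding_energy_ev(n, j):
    """m c^2 - H' in electron-volts."""
    return (1 - level_ratio(n, j)) * MC2_EV

ground = binding_energy_ev(0, 1)
# The two levels with n + |j| = 2 lie very close together.
split = binding_energy_ev(1, 1) - binding_energy_ev(0, 2)
print(f"ground-state binding energy: {ground:.4f} eV")
print(f"separation of the n + |j| = 2 levels: {split:.3e} eV")
```

Since (54) involves only $j^2$, the levels with $j$ and $-j$ coincide for $n > 0$. The separation printed is of order $\alpha^4 mc^2$, against $\alpha^2 mc^2$ for the binding energy itself.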


73. Theory of the positron 


It has been mentioned in § 67 that the wave equation for the
electron admits of twice as many solutions as it ought to, half of them
referring to states with negative values for the kinetic energy $cp_0 + eA_0$.
This difficulty was introduced as soon as we passed from equation (5)
to equation (6) and is inherent in any relativistic theory. It occurs
also in classical relativistic theory, but is not then serious since, owing
to the continuity in the variation of all classical dynamical variables,
if the kinetic energy $cp_0 + eA_0$ is initially positive (when it must be
greater than or equal to $mc^2$), it cannot subsequently be negative
(when it would have to be less than or equal to $-mc^2$). In the
quantum theory, however, discontinuous transitions may take place,
so that if the electron is initially in a state of positive kinetic energy
it may make a transition to a state of negative kinetic energy. It is
therefore no longer permissible simply to ignore the negative-energy
states, as one can do in the classical theory.

Let us examine the negative-energy solutions of the equation
$$\left\{\left(p_0 + \frac{e}{c}A_0\right) - \alpha_1\left(p_1 + \frac{e}{c}A_1\right) - \alpha_2\left(p_2 + \frac{e}{c}A_2\right) - \alpha_3\left(p_3 + \frac{e}{c}A_3\right) - \alpha_m mc\right\}\psi = 0 \tag{56}$$
a little more closely. For this purpose it is convenient to use a
representation of the $\alpha$'s in which all the elements of the matrices
representing $\alpha_1$, $\alpha_2$, and $\alpha_3$ are real and all those of the matrix representing
$\alpha_m$ are pure imaginary or zero. Such a representation may be obtained,
for instance, from that of § 67 by interchanging the expressions for $\alpha_2$
and $\alpha_m$ in (9). If equation (56) is expressed as a matrix equation in
this representation and we put $-i$ for $i$ all through it, we get,
remembering the $i$ in (4),
$$\left\{\left(-p_0 + \frac{e}{c}A_0\right) - \alpha_1\left(-p_1 + \frac{e}{c}A_1\right) - \alpha_2\left(-p_2 + \frac{e}{c}A_2\right) - \alpha_3\left(-p_3 + \frac{e}{c}A_3\right) + \alpha_m mc\right\}\bar\psi = 0. \tag{57}$$
Thus each solution $\psi$ of the wave equation (56) has for its conjugate
complex $\bar\psi$ a solution of the wave equation (57). Further, if the solution
$\psi$ of (56) belongs to a negative value for $cp_0 + eA_0$, the corresponding
solution $\bar\psi$ of (57) will belong to a positive value for $cp_0 - eA_0$. But the
operator in (57) is just what one would get if one substituted $-e$ for $e$
in the operator in (56). It follows that each negative-energy solution
of (56) is the conjugate complex of a positive-energy solution of the
wave equation obtained from (56) by substitution of $-e$ for $e$, which
solution represents an electron of charge $+e$ (instead of $-e$, as we
had up to the present) moving through the given electromagnetic field.
Thus the unwanted solutions of (56) are connected with the motion
of an electron with a charge $+e$. (It is not possible, of course, with
an arbitrary electromagnetic field, to separate the solutions of (56)
definitely into those referring to positive and those referring to negative
values for $cp_0 + eA_0$, as such a separation would imply that transitions
from one kind to the other do not occur. The preceding discussion is
therefore only a rough one, applying to the case when such a separation
is approximately possible.)

In this way we are led to infer that the negative-energy solutions 
of (56) refer to the motion of a new kind of particle having the mass 
of an electron and the opposite charge. Such particles have been 
observed experimentally and are called positrons. We cannot, how- 
ever, simply assert that the negative-energy solutions represent posi- 
trons, as this would make the dynamical relations all wrong. For 
instance, it is certainly not true that a positron has a negative kinetic 
energy. We must therefore establish the theory of the positrons on 
a somewhat different footing. We assume that nearly all the negative- 
energy states are occupied, with one electron in each state in accordance 
with the exclusion principle of Pauli. An unoccupied negative-energy
state will now appear as something with a positive energy, since to
make it disappear, i.e. to fill it up, we should have to add to it an 
electron with negative energy. We assume that these unoccupied 
negative-energy states are the positrons. 

These assumptions require there to be a distribution of electrons
of infinite density everywhere in the world. A perfect vacuum is a
region where all the states of positive energy are unoccupied and all
those of negative energy are occupied. In a perfect vacuum Maxwell's
equation
$$\mathrm{div}\,\boldsymbol{\mathcal{E}} = 0$$
must, of course, be valid. This means that the infinite distribution
of negative-energy electrons does not contribute to the electric field.
Only departures from the distribution in a vacuum will contribute
to the electric density $j_0$ in Maxwell's equation
$$\mathrm{div}\,\boldsymbol{\mathcal{E}} = 4\pi j_0. \tag{58}$$
Thus there will be a contribution $-e$ for each occupied state of
positive energy and a contribution $+e$ for each unoccupied state of
negative energy.

The exclusion principle will operate to prevent a positive-energy 
electron ordinarily from making transitions to states of negative 
energy. It will still be possible, however, for such an electron to 
drop into an unoccupied state of negative energy. In this case we 
should have an electron and positron disappearing simultaneously, 
their energy being emitted in the form of radiation. The converse 
process would consist in the creation of an electron and a positron 
from electromagnetic radiation. 

From the symmetry between occupied and unoccupied fermion 
states discussed at the end of § 65, the present theory is essentially 
symmetrical between the electrons and the positrons. We should 
have an equivalent theory if we supposed the positrons to be the 
basic particles, described by wave equations of the form (11) with —e 
for e, and then supposed that nearly all the states of negative energy 
for the positrons are filled up, a hole in the distribution of negative- 
energy positrons being then interpreted as an ordinary electron. The 
theory could be developed consistently with the hypothesis that all 
the laws of physics are symmetrical between positive and negative 
electric charge. 


XII 
QUANTUM ELECTRODYNAMICS 


74. The electromagnetic field in the absence of matter

THE theory of radiation that was set up in Chapter X involved some
approximations in its handling of the interaction of the radiation 
with matter. The object of the present chapter is to remove these 
approximations and get, as far as possible, an accurate theory of the 
electromagnetic field interacting with matter, subject to the limitation 
that the matter consists only of electrons and positrons. Too little is 
known about other forms of matter, protons, neutrons, etc., for one 
to attempt at the present time to get an accurate theory of their
interaction with the electromagnetic field. But there exists a precise 
theory of electrons and positrons, as given in the preceding chapter, 
which one can use for building up a precise theory of the interaction 
of the electromagnetic field with this form of matter. The theory 
must bring in the interaction of the electrons and positrons with one 
another, through their Coulomb forces, as well as their interaction 
with electromagnetic radiation, and it must, of course, conform to 
special relativity. For brevity in this chapter we shall take c = 1. 

We must first consider the electromagnetic field without interaction 
with matter. Now in § 63 we set up first a treatment of the field of 
radiation without interaction of matter. Dynamical variables were 
there introduced to describe the field, commutation relations were 
established for them, and a Hamiltonian was found which made them 
vary correctly with the time. No approximations were made in this 
piece of work. The resulting theory would therefore be a satisfactory, 
exact theory of radiation without interaction with matter, were it not 
for one feature in it, namely our taking the scalar potential to be zero. 
This feature spoils the relativistic form of the theory and makes it 
unsuitable as a starting-point from which to develop a precise theory 
of the electromagnetic field in interaction with matter. 

We must therefore extend the treatment of § 63 by leaving $A_0$
general and bringing it into the work along with the other potentials
$A_1$, $A_2$, $A_3$. Thus we shall have the four $A_\mu$, and they will satisfy, as
the generalization of (62) of § 63,
$$\Box A_\mu = 0, \tag{1}$$
$$\partial A_\mu/\partial x_\mu = 0. \tag{2}$$
For the present we shall ignore the second of these equations and
work only from the first.


Equation (1) shows that each $A_\mu$ can be resolved into waves
travelling with the velocity of light. Thus, corresponding to equation
(63) of § 63,
$$A_\mu(x) = \int (A^c_{\mu\mathbf{k}}\, e^{ik.x} + \bar A^c_{\mu\mathbf{k}}\, e^{-ik.x})\, d^3k, \tag{3}$$
where $k.x$ denotes the four-dimensional scalar product
$$k.x = k_0 x_0 - (\mathbf{k}, \mathbf{x}),$$
$k_\mu$ being the 4-vector whose space components are the same as the
components of the three-dimensional vector $\mathbf{k}$ of § 63 and whose time
component $k_0 = |\mathbf{k}|$, and $d^3k$ denotes $dk_1\,dk_2\,dk_3$, as in § 63. The index
$c$ in the coefficients $A^c_{\mu\mathbf{k}}$ indicates that they are constant in time. We
shall later introduce some other Fourier coefficients $A_{\mu\mathbf{k}}$, not constant
in time, which must be distinguished from the present ones.

The Fourier component $A^c_{\mu\mathbf{k}}$ has a part $A^c_{0\mathbf{k}}$ coming from $A_0(x)$ and
a part $A^c_{r\mathbf{k}}$ ($r = 1, 2, 3$) which is a three-dimensional vector. The latter
can be decomposed into two parts, a longitudinal part lying in the
direction of $\mathbf{k}$, the direction of motion of the waves, and a transverse
part perpendicular to $\mathbf{k}$. The longitudinal part is $k_r k_s/k_0^2\,.\,A^c_{s\mathbf{k}}$. The
transverse part is
$$(\delta_{rs} - k_r k_s/k_0^2)A^c_{s\mathbf{k}} = \mathcal{A}^c_{r\mathbf{k}}, \tag{4}$$
say. It satisfies
$$k_r \mathcal{A}^c_{r\mathbf{k}} = 0. \tag{5}$$


It is known from the Maxwell theory of light that only the
transverse part is effective for giving electromagnetic radiation. Chapter X
dealt only with this transverse part, the $A_{\mathbf{k}l}$ of § 63 being the same as
the present $\mathcal{A}_{r\mathbf{k}}$ and equation (65) of § 63 corresponding to the present
equation (5). Nevertheless, the longitudinal part cannot be neglected
in a complete theory of electrodynamics because of its connexion
with the Coulomb forces, as will show up later.

We can now decompose the three-dimensional vector $A_r(x)$ into
two parts, a transverse part and a longitudinal part. The former is
$$\mathcal{A}_r(x) = \int (\mathcal{A}^c_{r\mathbf{k}}\, e^{ik.x} + \bar{\mathcal{A}}^c_{r\mathbf{k}}\, e^{-ik.x})\, d^3k$$
and satisfies
$$\partial\mathcal{A}_r(x)/\partial x_r = 0. \tag{6}$$
The longitudinal part may be expressed as the gradient $\partial V/\partial x_r$ of a
scalar $V$ given by
$$V = i\int k_s/k_0^2\,.\,(A^c_{s\mathbf{k}}\, e^{ik.x} - \bar A^c_{s\mathbf{k}}\, e^{-ik.x})\, d^3k. \tag{7}$$
Thus
$$A_r = \mathcal{A}_r + \partial V/\partial x_r. \tag{8}$$
The magnetic field is determined by the transverse part of $A_r$,
$$\boldsymbol{\mathcal{H}} = \mathrm{curl}\,\mathbf{A} = \mathrm{curl}\,\boldsymbol{\mathcal{A}}.$$
It is convenient to count $A_0(x)$ as longitudinal, so that the complete
potentials $A_\mu(x)$ are separated into a transverse part $\mathcal{A}_r(x)$ and a
longitudinal part $A_0$, $\partial V/\partial x_r$. This separation, of course, refers to a
particular Lorentz frame of reference and must not be used when one
wants to keep one's equations in a relativistic form.


Each Fourier coefficient $A^c_{\mu\mathbf{k}}$ occurs in (3) combined with the time
factor $e^{ik_0x_0}$. The product
$$A^c_{\mu\mathbf{k}}\, e^{ik_0x_0} = A_{\mu\mathbf{k}} \tag{9}$$
say, forms a Hamiltonian dynamical variable in classical mechanics
and a Heisenberg dynamical variable in quantum mechanics, like
the $A_{\mathbf{k}l}$ of § 63.

The work of § 63 gives us the P.B. relations for the transverse part
of $A_{\mu\mathbf{k}}$. To connect up with it, we pass over to discrete $\mathbf{k}$-values in
three-dimensional $\mathbf{k}$-space and take, for example, a particular discrete
$\mathbf{k}$-value for which $k_1 = k_2 = 0$, $k_3 = k_0 > 0$. Then the polarization
variable $l$ can take on two values referring to the two directions 1
and 2 and equation (73) of § 63 gives, with the help of the commutation
relations for the $\eta$'s and $\bar\eta$'s, equations (11) of § 60,
$$[\bar A_{1\mathbf{k}}, A_{1\mathbf{k}}] = [\bar A_{2\mathbf{k}}, A_{2\mathbf{k}}] = -i s_{\mathbf{k}}/4\pi^2 k_0. \tag{10}$$
The work of § 63 gives us no information about $A_{3\mathbf{k}}$ and $A_{0\mathbf{k}}$.
However, we can now obtain the P.B. relations for $A_{3\mathbf{k}}$ and $A_{0\mathbf{k}}$
from the theory of relativity. Equations (10) have to be built up into
a relativistic set and the only simple way of doing so is by adding to
them the two further equations
$$[\bar A_{3\mathbf{k}}, A_{3\mathbf{k}}] = -[\bar A_{0\mathbf{k}}, A_{0\mathbf{k}}] = -i s_{\mathbf{k}}/4\pi^2 k_0, \tag{11}$$
so that the four equations (10) and (11), together with the conditions
that $\bar A_{\mu\mathbf{k}}$ and $A_{\nu\mathbf{k}}$ commute for $\mu \neq \nu$ (as they must do since they
refer to different degrees of freedom), combine to form the single
tensor equation
$$[\bar A_{\mu\mathbf{k}}, A_{\nu\mathbf{k}}] = i g_{\mu\nu} s_{\mathbf{k}}/4\pi^2 k_0. \tag{12}$$
We get in this way the P.B. relations for all the dynamical variables.
Equation (12) can be extended to
$$[\bar A_{\mu\mathbf{k}}, A_{\nu\mathbf{k}'}] = i g_{\mu\nu} s_{\mathbf{k}}\,\delta_{\mathbf{k}\mathbf{k}'}/4\pi^2 k_0. \tag{13}$$

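Let us now return to continuous $\mathbf{k}$-values. To convert $\delta_{\mathbf{k}\mathbf{k}'}$ to
continuous $\mathbf{k}$-values we note that, for a general function $f(\mathbf{k})$ in
three-dimensional $\mathbf{k}$-space,
$$\sum_{\mathbf{k}} f(\mathbf{k})\,\delta_{\mathbf{k}\mathbf{k}'} = f(\mathbf{k}') = \int f(\mathbf{k})\,\delta(\mathbf{k} - \mathbf{k}')\, d^3k, \tag{14}$$
where $\delta(\mathbf{k} - \mathbf{k}')$ is the three-dimensional $\delta$ function
$$\delta(\mathbf{k} - \mathbf{k}') = \delta(k_1 - k_1')\,\delta(k_2 - k_2')\,\delta(k_3 - k_3').$$
In order that (14) may conform to the standard formula connecting
sums and integrals, equation (52) of § 62, we must have
$$s_{\mathbf{k}}\,\delta_{\mathbf{k}\mathbf{k}'} = \delta(\mathbf{k} - \mathbf{k}'). \tag{15}$$
Thus (13) goes over to
$$[\bar A_{\mu\mathbf{k}}, A_{\nu\mathbf{k}'}] = i g_{\mu\nu}/4\pi^2 k_0\,.\,\delta(\mathbf{k} - \mathbf{k}'). \tag{16}$$
This equation, together with the equations
$$[A_{\mu\mathbf{k}}, A_{\nu\mathbf{k}'}] = [\bar A_{\mu\mathbf{k}}, \bar A_{\nu\mathbf{k}'}] = 0, \tag{17}$$
provides the P.B. relations in the theory with continuous $\mathbf{k}$-values.
It should be noted that these P.B. relations remain valid if we replace
$A_{\mu\mathbf{k}}$, $A_{\nu\mathbf{k}'}$ by the constant Fourier coefficients $A^c_{\mu\mathbf{k}}$, $A^c_{\nu\mathbf{k}'}$.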

We must now obtain a Hamiltonian which makes each dynamical
variable $A_{\mu\mathbf{k}}$ vary with the time $t = x_0$ in the Heisenberg picture
according to the law (9) with $A^c_{\mu\mathbf{k}}$ constant. Calling this Hamiltonian
$H_F$, we require
$$[A_{\mu\mathbf{k}}, H_F] = dA_{\mu\mathbf{k}}/dx_0 = ik_0 A_{\mu\mathbf{k}}. \tag{18}$$
It is easily seen that this is satisfied by
$$H_F = -4\pi^2 \int k_0^2\, A_{\mu\mathbf{k}} \bar A^{\mu}_{\mathbf{k}}\, d^3k. \tag{19}$$
We therefore take (19), with the possible addition of an arbitrary
numerical term not involving any dynamical variables, as the
Hamiltonian for the electromagnetic field in the absence of matter.

In § 63 we used our knowledge of the transverse part of the Hamil- 
tonian to obtain the P.B.s of the transverse variables. We have now 
applied the reverse procedure to the longitudinal variables, using our 
knowledge of their P.B.s, obtained by a relativistic argument, to 
find the part of the Hamiltonian that refers to them so as to get 
agreement with (18). 

If we write out the Hamiltonian (19) it appears as
$$H_F = 4\pi^2 \int k_0^2\,(A_{1\mathbf{k}}\bar A_{1\mathbf{k}} + A_{2\mathbf{k}}\bar A_{2\mathbf{k}} + A_{3\mathbf{k}}\bar A_{3\mathbf{k}} - A_{0\mathbf{k}}\bar A_{0\mathbf{k}})\, d^3k.$$
The first three terms of the integrand here have a transverse part
which is just equal to the transverse energy given by (71) of § 63.
The last term of the integrand, which is the part of $H_F$ referring to
the scalar potential $A_0$, appears with a minus sign. This minus sign
is demanded by relativity and means that the dynamical system
formed by the variables $A_{0\mathbf{k}}$, $\bar A_{0\mathbf{k}}$ is a harmonic oscillator of negative
energy. It is rather surprising that such an unphysical idea as negative
energy should appear in the theory in this way. We shall see in § 77
that the negative energy associated with the degrees of freedom
connected with $A_0$ is always compensated by the positive energy
associated with the other longitudinal degrees of freedom, so that
it never shows up in practice.


75. Relativistic form of the quantum conditions 

The theory of the preceding section has relativistic field equations, 
namely equations (1). To establish that the theory is fully relativistic 
we must show further that the P.B. relations are relativistic. This is 
not at all evident from the form (16) in which they are written in 
terms of Fourier components. We shall obtain a relativistic form for 
the P.B.s by working out $[A_\mu(x), A_\nu(x')]$ with $x$ and $x'$ any two points
in space-time. We must first, however, study a certain invariant 
singular function that exists in space-time. 
The function $\delta(x_\mu x^\mu)$ is evidently Lorentz invariant. It vanishes
everywhere except on the light-cone with the origin as vertex, i.e. the
three-dimensional space $x_\mu x^\mu = 0$. This light-cone consists of two
distinct parts, a future part, for which $x_0 > 0$, and a past part, for which
$x_0 < 0$. The function which equals $\delta(x_\mu x^\mu)$ on the future part of the
light-cone and $-\delta(x_\mu x^\mu)$ on the past part of the light-cone is also
Lorentz invariant. This function, equal to $\delta(x_\mu x^\mu)\,x_0/|x_0|$, plays
an important role in the dynamical theory of fields, so we introduce
a special notation for it. We define


$$\Delta(x) = 2\,\delta(x_\mu x^\mu)\,x_0/|x_0|. \tag{20}$$
This definition gives a meaning to the function $\Delta$ applied to any
4-vector. With the help of (9) of § 15, we can express $\delta(x_\mu x^\mu)$ in the
form
$$\delta(x_\mu x^\mu) = \tfrac12\,|\mathbf{x}|^{-1}\{\delta(x_0-|\mathbf{x}|)+\delta(x_0+|\mathbf{x}|)\}, \tag{21}$$
$|\mathbf{x}|$ being the length of the three-dimensional part of $x_\mu$, and then
$\Delta(x)$ takes the form
$$\Delta(x) = |\mathbf{x}|^{-1}\{\delta(x_0-|\mathbf{x}|)-\delta(x_0+|\mathbf{x}|)\}. \tag{22}$$
$\Delta(x)$ is defined to have the value zero at the origin, and evidently
$\Delta(-x) = -\Delta(x)$.
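
The resolution (21) of $\delta(x_\mu x^\mu)$ into two simple $\delta$ functions can be checked numerically. The sketch below is only an illustration under stated assumptions: the $\delta$ function is replaced by a narrow Gaussian, and the width `eps`, the test function and the integration grid are arbitrary choices that do not come from the text.

```python
import math

def nascent_delta(u, eps):
    """Gaussian stand-in for delta(u); it tends to delta(u) as eps -> 0."""
    return math.exp(-(u / eps) ** 2) / (eps * math.sqrt(math.pi))

def midpoint_integral(f, a, b, n):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def smeared_side(g, r, eps=1e-2, n=60_000):
    """Integral over x0 of g(x0) * delta(x0^2 - r^2), with delta smeared."""
    def integrand(x0):
        return g(x0) * nascent_delta(x0 * x0 - r * r, eps)
    return midpoint_integral(integrand, -3.0, 3.0, n)

def formula_21_side(g, r):
    """The same integral evaluated with the right-hand side of (21)."""
    return (g(r) + g(-r)) / (2.0 * r)

r = 1.0                               # plays the role of |x|
lhs = smeared_side(math.exp, r)       # left-hand side of (21), smeared
rhs = formula_21_side(math.exp, r)    # right-hand side of (21)
print(lhs, rhs)                       # the two values should agree closely
```

Replacing the plus sign in `formula_21_side` by a minus sign, and multiplying the integrand by the sign of `x0`, gives the corresponding check of (22) apart from the factor 2 in (20).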

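Let us make a Fourier analysis of $\Delta(x)$. Using $d^4x$ to denote
$dx_0\,dx_1\,dx_2\,dx_3$ and $d^3x$ to denote $dx_1\,dx_2\,dx_3$, we have, for any
4-vector $k_\mu$,
$$\begin{aligned}
\int \Delta(x)\,e^{ik\cdot x}\,d^4x &= \int |\mathbf{x}|^{-1}\{\delta(x_0-|\mathbf{x}|)-\delta(x_0+|\mathbf{x}|)\}\,e^{ik\cdot x}\,d^4x\\
&= \int |\mathbf{x}|^{-1}\{e^{ik_0|\mathbf{x}|}-e^{-ik_0|\mathbf{x}|}\}\,e^{-i(\mathbf{k}\mathbf{x})}\,d^3x.
\end{aligned}$$
By introducing polar coordinates $|\mathbf{x}|, \theta, \phi$ in the three-dimensional
$x_1 x_2 x_3$ space, with the direction of the three-dimensional part of $k_\mu$
as pole, we get
$$\begin{aligned}
\int \Delta(x)\,e^{ik\cdot x}\,d^4x &= \iiint \{e^{ik_0|\mathbf{x}|}-e^{-ik_0|\mathbf{x}|}\}\,e^{-i|\mathbf{k}||\mathbf{x}|\cos\theta}\,|\mathbf{x}|\sin\theta\,d\theta\,d\phi\,d|\mathbf{x}|\\
&= 2\pi\int_0^\infty \{e^{ik_0|\mathbf{x}|}-e^{-ik_0|\mathbf{x}|}\}\,d|\mathbf{x}| \int_0^\pi e^{-i|\mathbf{k}||\mathbf{x}|\cos\theta}\,|\mathbf{x}|\sin\theta\,d\theta\\
&= 2\pi i\,|\mathbf{k}|^{-1}\int_0^\infty \{e^{ik_0|\mathbf{x}|}-e^{-ik_0|\mathbf{x}|}\}\{e^{-i|\mathbf{k}||\mathbf{x}|}-e^{i|\mathbf{k}||\mathbf{x}|}\}\,d|\mathbf{x}|\\
&= 2\pi i\,|\mathbf{k}|^{-1}\int_{-\infty}^{\infty} \{e^{i(k_0-|\mathbf{k}|)a}-e^{i(k_0+|\mathbf{k}|)a}\}\,da\\
&= 4\pi^2 i\,|\mathbf{k}|^{-1}\{\delta(k_0-|\mathbf{k}|)-\delta(k_0+|\mathbf{k}|)\}\\
&= 4\pi^2 i\,\Delta(k).
\end{aligned}\tag{23}$$
Thus the Fourier analysis gives the same function again, with the
coefficient $4\pi^2 i$. Interchanging $k$ and $x$ in (23), we get
$$\Delta(x) = -\frac{i}{4\pi^2}\int \Delta(k)\,e^{ik\cdot x}\,d^4k. \tag{24}$$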


Some of the important properties of $\Delta(x)$ can easily be deduced
from its Fourier resolution. In the first place equation (24) shows that
$\Delta(x)$ can be resolved into waves all travelling with the velocity of
light. To get an equation for this result we apply the operator $\Box$ to
both sides of (24), thus
$$\Box\Delta(x) = -\frac{i}{4\pi^2}\int \Delta(k)\,\Box e^{ik\cdot x}\,d^4k = \frac{i}{4\pi^2}\int k_\mu k^\mu\,\Delta(k)\,e^{ik\cdot x}\,d^4k.$$
Now $k_\mu k^\mu\,\Delta(k) = 0$, and hence
$$\Box\Delta(x) = 0. \tag{25}$$


This equation holds throughout space-time. We can give a meaning
to $\Box\Delta(x)$ at a point where $\Delta(x)$ is singular by taking the integral
of $\Box\Delta(x)$ over a small four-dimensional space surrounding the point
and transforming it to a three-dimensional surface integral by Gauss’s
theorem. Equation (25) informs us that the three-dimensional surface
integral always vanishes.

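The function $\Delta(x)$ vanishes all over the three-dimensional surface
$x_0 = 0$. Let us determine the value of $\partial\Delta(x)/\partial x_0$ on this surface. It
evidently vanishes everywhere except at the point $x_1 = x_2 = x_3 = 0$,
where it has a singularity which can be evaluated as follows. Differentiating
both sides of (24) with respect to $x_0$, we get
$$\begin{aligned}
\partial\Delta(x)/\partial x_0 &= \frac{1}{4\pi^2}\int k_0\,\Delta(k)\,e^{ik\cdot x}\,d^4k\\
&= \frac{1}{4\pi^2}\int k_0\,|\mathbf{k}|^{-1}\{\delta(k_0-|\mathbf{k}|)-\delta(k_0+|\mathbf{k}|)\}\,e^{ik\cdot x}\,d^4k\\
&= \frac{1}{4\pi^2}\int \{\delta(k_0-|\mathbf{k}|)+\delta(k_0+|\mathbf{k}|)\}\,e^{ik\cdot x}\,d^4k.
\end{aligned}$$
Putting $x_0 = 0$ on both sides here, we get
$$\begin{aligned}
[\partial\Delta(x)/\partial x_0]_{x_0=0} &= \frac{1}{4\pi^2}\int \{\delta(k_0-|\mathbf{k}|)+\delta(k_0+|\mathbf{k}|)\}\,e^{-i(\mathbf{k}\mathbf{x})}\,d^4k\\
&= \frac{1}{2\pi^2}\int e^{-i(\mathbf{k}\mathbf{x})}\,d^3k\\
&= 4\pi\,\delta(x_1)\delta(x_2)\delta(x_3) = 4\pi\,\delta(\mathbf{x}).
\end{aligned}\tag{26}$$
Thus the ordinary $\delta$ singularity, with the coefficient $4\pi$, appears at
the point $x_1 = x_2 = x_3 = 0$.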
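Let us now evaluate $[A_\mu(x), A_\nu(x')]$. We have from (3), (16), and (17)
$$\begin{aligned}
[A_\mu(x), A_\nu(x')] &= \iint [A_{\mu\mathbf{k}}e^{ik\cdot x}+\bar A_{\mu\mathbf{k}}e^{-ik\cdot x},\; A_{\nu\mathbf{k}'}e^{ik'\cdot x'}+\bar A_{\nu\mathbf{k}'}e^{-ik'\cdot x'}]\,d^3k\,d^3k'\\
&= \frac{ig_{\mu\nu}}{4\pi^2}\iint k_0^{-1}\{e^{-ik\cdot x}e^{ik'\cdot x'}-e^{ik\cdot x}e^{-ik'\cdot x'}\}\,\delta(\mathbf{k}-\mathbf{k}')\,d^3k\,d^3k'\\
&= \frac{ig_{\mu\nu}}{4\pi^2}\int k_0^{-1}\{e^{-ik\cdot(x-x')}-e^{ik\cdot(x-x')}\}\,d^3k.
\end{aligned}\tag{27}$$
The $k_0$ here is defined to be equal to $|\mathbf{k}|$ and is thus always positive.
By putting $-\mathbf{k}$ for $\mathbf{k}$ in the second part of the integrand, one finds
that (27) is equal to the four-dimensional integral
$$\frac{ig_{\mu\nu}}{4\pi^2}\int |\mathbf{k}|^{-1}\{\delta(k_0-|\mathbf{k}|)-\delta(k_0+|\mathbf{k}|)\}\,e^{-ik\cdot(x-x')}\,d^4k = \frac{ig_{\mu\nu}}{4\pi^2}\int \Delta(k)\,e^{-ik\cdot(x-x')}\,d^4k,$$
in which $k_0$ takes on all values, negative as well as positive. Evaluating
this with the help of (24), we get finally
$$[A_\mu(x), A_\nu(x')] = g_{\mu\nu}\,\Delta(x-x'), \tag{28}$$
a result which shows that the P.B. relations are invariant under
Lorentz transformations.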

The formula (28) means that the potentials at two points in space- 
time always commute unless the line joining the two points is a null 
line, i.e. the track of a light-ray. The formula is consistent with the 
field equations $\Box A_\mu(x) = 0$, because $\Box$ applied to the right-hand
side gives zero, from (25). 


76. The dynamical variables at one time 

As a basis for a theory with interaction we must use the dynamical 
variables at one time. The relationships between the dynamical 
variables at one time (i.e. their P.B.s) are not affected by the introduc- 
tion of interaction. On the other hand the relationships between the 
dynamical variables at different times (comprising the field equations 
as well as the P.B.s of variables at different times) are very much
affected by the interaction. The dynamical variables at one time form 
a non-relativistic concept, but a very important concept in Hamil- 
tonian theory. 

For the case of the electromagnetic field the independent dynamical
variables at one time are $A_\mu$ and $\partial A_\mu/\partial x_0$ for all values of $x_1, x_2, x_3$ for
the given $x_0$. The higher time derivatives $\partial^2 A_\mu/\partial x_0^2, \ldots$, are not
independent. Let us put
$$\partial A_\mu/\partial x_0 = B_\mu. \tag{29}$$
Then we have $A_{\mu\mathbf{x}}$, $B_{\mu\mathbf{x}}$, with the suffix $\mathbf{x}$ denoting $x_1, x_2, x_3$, as the
dynamical variables at one time.
The Fourier resolution of these variables is, from (3) and (9), 


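$$\begin{aligned}
A_{\mu\mathbf{x}} &= \int (A_{\mu\mathbf{k}}+\bar A_{\mu\,-\mathbf{k}})\,e^{-i(\mathbf{k}\mathbf{x})}\,d^3k,\\
B_{\mu\mathbf{x}} &= i\int k_0\,(A_{\mu\mathbf{k}}-\bar A_{\mu\,-\mathbf{k}})\,e^{-i(\mathbf{k}\mathbf{x})}\,d^3k.
\end{aligned}\tag{30}$$
We may reverse the Fourier transformation and express $A_{\mu\mathbf{k}}+\bar A_{\mu\,-\mathbf{k}}$
and $A_{\mu\mathbf{k}}-\bar A_{\mu\,-\mathbf{k}}$ in terms of $A_{\mu\mathbf{x}}$ and $B_{\mu\mathbf{x}}$, respectively. Thus $A_{\mu\mathbf{k}}$ and
$\bar A_{\mu\mathbf{k}}$ are determined by $A_{\mu\mathbf{x}}$, $B_{\mu\mathbf{x}}$ for all $\mathbf{x}$ (at a given $x_0$). The equations
connecting $A_{\mu\mathbf{k}}$, $\bar A_{\mu\mathbf{k}}$ with $A_{\mu\mathbf{x}}$, $B_{\mu\mathbf{x}}$ do not involve the time
explicitly. Thus the $A_{\mu\mathbf{k}}$, $\bar A_{\mu\mathbf{k}}$ form an alternative set of one-time
dynamical variables, on the same footing as the $A_{\mu\mathbf{x}}$, $B_{\mu\mathbf{x}}$.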

When we work with the variables $A_{\mu\mathbf{x}}$, $B_{\mu\mathbf{x}}$ we shall need to know
their P.B. relations. These may be obtained either from the Fourier
expansions (30) together with (16) and (17) or from the general P.B.
relation (28). The latter gives the required results more quickly.
Putting $x_0' = x_0$ in (28), we get
$$[A_{\mu\mathbf{x}}, A_{\nu\mathbf{x}'}] = 0. \tag{31}$$
Differentiating (28) with respect to $x_0$ and then putting $x_0' = x_0$, we
get, with the help of (26),
$$[B_{\mu\mathbf{x}}, A_{\nu\mathbf{x}'}] = 4\pi g_{\mu\nu}\,\delta(\mathbf{x}-\mathbf{x}'). \tag{32}$$
Differentiating (28) with respect to both $x_0$ and $x_0'$ and then putting
$x_0' = x_0$, we get
$$[B_{\mu\mathbf{x}}, B_{\nu\mathbf{x}'}] = 0, \tag{33}$$
since $\partial^2\Delta(x)/\partial x_0^2 = 0$ for $x_0 = 0$. Equations (31), (32), and (33) give
all the P.B. relations between the $A_{\mu\mathbf{x}}$, $B_{\mu\mathbf{x}}$ variables. They show that,
apart from numerical coefficients, the $A_{\mu\mathbf{x}}$ can be looked upon as a set
of dynamical coordinates and the $B_{\mu\mathbf{x}}$ as their conjugate momenta,
there being a $\delta$ function on the right-hand side of (32) instead of a
two-suffix $\delta$ symbol on account of the number of degrees of freedom
being a continuous infinity.

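We can decompose $A_{r\mathbf{x}}$ into a transverse and a longitudinal part,
as shown by equations (8) and (6). We can do the same with $B_{r\mathbf{x}}$ and
get
$$B_r = \mathcal{B}_r + \partial U/\partial x_r \tag{34}$$
with
$$\partial\mathcal{B}_r/\partial x_r = 0. \tag{35}$$
From (7) with $-\mathbf{k}$ substituted for $\mathbf{k}$ in the second term of the integrand,
$$V = i\int k_r k_0^{-2}\,(A_{r\mathbf{k}}+\bar A_{r\,-\mathbf{k}})\,e^{-i(\mathbf{k}\mathbf{x})}\,d^3k. \tag{36}$$
The corresponding equation for $U$ is, since $U = \partial V/\partial x_0$,
$$U = -\int k_r k_0^{-1}\,(A_{r\mathbf{k}}-\bar A_{r\,-\mathbf{k}})\,e^{-i(\mathbf{k}\mathbf{x})}\,d^3k. \tag{37}$$
The electric field is given by
$$\mathcal{E}_r = \frac{\partial A_r}{\partial x_0} + \frac{\partial A_0}{\partial x_r} = \mathcal{B}_r + \frac{\partial(A_0+U)}{\partial x_r}. \tag{38}$$
Thus
$$\operatorname{div}\mathcal{E} = -\frac{\partial\mathcal{E}_r}{\partial x_r} = -\nabla^2(A_0+U). \tag{39}$$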


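It is evident that any longitudinal variable commutes with any
transverse variable. Some useful P.B. relations will now be worked
out. We shall use the notation, for any field function $f_{\mathbf{x}}$,
$$f_{\mathbf{x}}^{\,r} = \partial f_{\mathbf{x}}/\partial x_r, \tag{40}$$
a primed suffix $r'$ denoting in the same way differentiation with
respect to $x_r'$.
If in (32) we put $\mu = r$, $\nu = s$ and differentiate the equation with
respect to $x_r$, we get
$$[B_{r\mathbf{x}}^{\,r}, A_{s\mathbf{x}'}] = 4\pi g_{rs}\,\delta^{r}(\mathbf{x}-\mathbf{x}') = -4\pi\,\delta^{s}(\mathbf{x}-\mathbf{x}'),$$
or, from (39),
$$[\operatorname{div}\mathcal{E}_{\mathbf{x}}, A_{s\mathbf{x}'}] = 4\pi\,\delta^{s}(\mathbf{x}-\mathbf{x}'). \tag{41}$$
Now (39) shows that $\operatorname{div}\mathcal{E}$ is a function only of the longitudinal
variables, so (41) gives
$$[\operatorname{div}\mathcal{E}_{\mathbf{x}}, V_{\mathbf{x}'}^{\,s'}] = 4\pi\,\delta^{s}(\mathbf{x}-\mathbf{x}') = -4\pi\,\delta^{s'}(\mathbf{x}-\mathbf{x}').$$
Integrating with respect to $x_s'$, we get
$$[\operatorname{div}\mathcal{E}_{\mathbf{x}}, V_{\mathbf{x}'}] = -4\pi\,\delta(\mathbf{x}-\mathbf{x}'), \tag{42}$$
there being no constant of integration since the field functions $\mathcal{E}_{\mathbf{x}}$ and
$V_{\mathbf{x}}$ are made up of waves of non-zero wave length. From (42) and (39)
$$\nabla^2[U_{\mathbf{x}}, V_{\mathbf{x}'}] = 4\pi\,\delta(\mathbf{x}-\mathbf{x}').$$
Integrating with the help of formula (72) of § 38, we get
$$[U_{\mathbf{x}}, V_{\mathbf{x}'}] = -|\mathbf{x}-\mathbf{x}'|^{-1}, \tag{43}$$
there being no constant of integration or other terms not vanishing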


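at infinity on the right-hand side, because $U_{\mathbf{x}}$ and $V_{\mathbf{x}}$ are made up of
waves of non-zero wave length. We have from (38) and (43)
$$[\mathcal{E}_{r\mathbf{x}}, V_{\mathbf{x}'}] = [U_{\mathbf{x}}^{\,r}, V_{\mathbf{x}'}] = (x_r-x_r')\,|\mathbf{x}-\mathbf{x}'|^{-3}. \tag{44}$$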
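We shall now obtain the Hamiltonian in terms of the $A_{\mu\mathbf{x}}$ and $B_{\mu\mathbf{x}}$
variables. We have from the second of equations (30)
$$\begin{aligned}
\int B_{\mu\mathbf{x}}B^{\mu}_{\mathbf{x}}\,d^3x &= -\iiint k_0k_0'\,(A_{\mu\mathbf{k}}-\bar A_{\mu\,-\mathbf{k}})(A^{\mu}_{\mathbf{k}'}-\bar A^{\mu}_{-\mathbf{k}'})\,e^{-i(\mathbf{k}\mathbf{x})}e^{-i(\mathbf{k}'\mathbf{x})}\,d^3k\,d^3k'\,d^3x\\
&= -8\pi^3\iint k_0k_0'\,(A_{\mu\mathbf{k}}-\bar A_{\mu\,-\mathbf{k}})(A^{\mu}_{\mathbf{k}'}-\bar A^{\mu}_{-\mathbf{k}'})\,\delta(\mathbf{k}+\mathbf{k}')\,d^3k\,d^3k'\\
&= -8\pi^3\int k_0^2\,(A_{\mu\mathbf{k}}-\bar A_{\mu\,-\mathbf{k}})(A^{\mu}_{-\mathbf{k}}-\bar A^{\mu}_{\mathbf{k}})\,d^3k.
\end{aligned}$$
Similarly, from the first of equations (30),
$$\begin{aligned}
\int A_{\mu\mathbf{x}}^{\,r}A^{\mu r}_{\mathbf{x}}\,d^3x &= -\iiint k_rk_r'\,(A_{\mu\mathbf{k}}+\bar A_{\mu\,-\mathbf{k}})(A^{\mu}_{\mathbf{k}'}+\bar A^{\mu}_{-\mathbf{k}'})\,e^{-i(\mathbf{k}\mathbf{x})}e^{-i(\mathbf{k}'\mathbf{x})}\,d^3k\,d^3k'\,d^3x\\
&= 8\pi^3\int k_0^2\,(A_{\mu\mathbf{k}}+\bar A_{\mu\,-\mathbf{k}})(A^{\mu}_{-\mathbf{k}}+\bar A^{\mu}_{\mathbf{k}})\,d^3k.
\end{aligned}$$
Adding and dividing by $-8\pi$, we get
$$-(8\pi)^{-1}\int (B_\mu B^\mu + A_\mu^{\,r}A^{\mu r})\,d^3x = -2\pi^2\int k_0^2\,(A_{\mu\mathbf{k}}\bar A^{\mu}_{\mathbf{k}}+\bar A_{\mu\mathbf{k}}A^{\mu}_{\mathbf{k}})\,d^3k.$$
This is equal to $H_F$ given by (19), apart from an infinite numerical
term. The formula (19) for $H_F$ already involves an arbitrary numerical
term, so we may take
$$H_F = -(8\pi)^{-1}\int (B_\mu B^\mu + A_\mu^{\,r}A^{\mu r})\,d^3x \tag{45}$$
with an arbitrary numerical term, different from that of (19).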

The Hamiltonian (45) can, of course, be used to give the Heisenberg
equations of motion, and the arbitrary numerical term in it does not
have any effect. One can easily check, using (31), (32) and (33), that
$$\partial A_\mu/\partial x_0 = [A_\mu, H_F] = B_\mu \tag{46}$$
and
$$\partial B_\mu/\partial x_0 = [B_\mu, H_F] = \nabla^2 A_\mu,$$
agreeing with (29) and (1). It also gives the Schrödinger equation of
motion
$$i\hbar\,d|P\rangle/dx_0 = H_F|P\rangle$$
for a ket $|P\rangle$ representing a state in the Schrödinger picture. The
arbitrary numerical term here has the effect of changing $|P\rangle$ by a
phase factor, which is not of physical importance.

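We can decompose the expression (45) for $H_F$ into a transverse
part $H_{FT}$ and a longitudinal part $H_{FL}$. We have from (34)
$$\int B_rB_r\,d^3x = \int (\mathcal{B}_r+U^r)(\mathcal{B}_r+U^r)\,d^3x = \int \mathcal{B}_r\mathcal{B}_r\,d^3x + \int U^rU^r\,d^3x,$$
since the cross terms vanish on account of
$$\int U^r\mathcal{B}_r\,d^3x = -\int U\mathcal{B}_r^{\,r}\,d^3x = 0$$
from (35). Similarly we have from (8)
$$\int A_s^{\,r}A_s^{\,r}\,d^3x = \int \mathcal{A}_s^{\,r}\mathcal{A}_s^{\,r}\,d^3x + \int V^{rs}V^{rs}\,d^3x,$$
with the cross terms vanishing again. Thus (45) becomes
$$H_F = H_{FT} + H_{FL},$$
with
$$H_{FT} = (8\pi)^{-1}\int (\mathcal{B}_r\mathcal{B}_r + \mathcal{A}_s^{\,r}\mathcal{A}_s^{\,r})\,d^3x \tag{47}$$
and
$$H_{FL} = (8\pi)^{-1}\int (U^rU^r + V^{rs}V^{rs} - B_0B_0 - A_0^{\,r}A_0^{\,r})\,d^3x. \tag{48}$$
It should be noted that the term
$$(8\pi)^{-1}\int \mathcal{A}_s^{\,r}\mathcal{A}_s^{\,r}\,d^3x$$
in $H_{FT}$ can be transformed to
$$\begin{aligned}
-(8\pi)^{-1}\int \mathcal{A}_s^{\,rr}\mathcal{A}_s\,d^3x &= -(8\pi)^{-1}\int (\mathcal{A}_s^{\,rr}-\mathcal{A}_r^{\,rs})\mathcal{A}_s\,d^3x\\
&= (8\pi)^{-1}\int (\mathcal{A}_s^{\,r}-\mathcal{A}_r^{\,s})\mathcal{A}_s^{\,r}\,d^3x\\
&= (16\pi)^{-1}\int (\mathcal{A}_s^{\,r}-\mathcal{A}_r^{\,s})(\mathcal{A}_s^{\,r}-\mathcal{A}_r^{\,s})\,d^3x\\
&= (8\pi)^{-1}\int \mathcal{H}^2\,d^3x,
\end{aligned}$$
so this term is just the magnetic energy. Some further partial
integrations give
$$\int V^{rs}V^{rs}\,d^3x = \int V^{rr}V^{ss}\,d^3x,$$
so (48) may be written
$$H_{FL} = (8\pi)^{-1}\int \{(U^r-A_0^{\,r})(U^r+A_0^{\,r}) + (V^{rr}-B_0)(V^{ss}+B_0)\}\,d^3x. \tag{49}$$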


77. The supplementary conditions 

We must now go back to the Maxwell equation (2), which we have 
ignored so far. We cannot take this equation over directly into the 
quantum theory without getting inconsistencies. The left-hand side 
of the equation does not commute with $A_\nu(x')$, according to the
quantum conditions (28), so this left-hand side cannot vanish. The
way out of the difficulty was shown by Fermi.† It consists in adopting
a less stringent equation, namely the equation
$$(\partial A_\mu/\partial x_\mu)|P\rangle = 0, \tag{50}$$
and assuming it to hold for any $|P\rangle$ corresponding to a state that can
actually occur in nature. There is one equation (50) for each point 
in space-time and these equations must all hold for any ket corre- 
sponding to a state that can actually occur. 

We shall call a condition such as (50), which a ket has to satisfy to 
correspond to an actual state, a supplementary condition. The exis- 
tence of supplementary conditions in the theory does not mean any 
departure from or modification in the general principles of quantum 
mechanics. The principle of superposition of states and the whole of 
the general theory of states, dynamical variables, and observables, 
as given in Chapter II, apply also when there are supplementary
conditions, provided we impose a further requirement on a linear
operator in order that it may represent an observable. We define a
linear operator to be physical if it has the property that, when it
operates on any ket satisfying the supplementary conditions, it
produces another ket satisfying the supplementary conditions. In order
that a linear operator may represent an observable it must evidently
satisfy the requirement of being physical, in addition to the
requirements of § 10.

† Fermi, Reviews of Modern Physics, 4 (1932), 125.

We have already had an example of supplementary conditions in 
the theory of systems containing several similar particles. The con- 
dition that only symmetrical wave functions, or only antisymmetrical 
wave functions, represent states that can actually occur in nature, is 
precisely of the same type as condition (50) and is what we are now 
calling a supplementary condition. In this theory the requirement 
that a linear operator shall be physical is that it shall be symmetrical 
between the similar particles. 

When we introduce supplementary conditions into our theory we 
must verify that they are consistent, i.e. not too restrictive to allow 
any ket at all to satisfy them. If we have more than one supplementary 
condition, we can deduce further supplementary conditions from them 
by taking P.B.s of the operators in them; thus if we have 

$$U|P\rangle = 0, \qquad V|P\rangle = 0, \tag{51}$$
we can deduce
$$[U,V]|P\rangle = 0, \qquad [U,[U,V]]|P\rangle = 0, \tag{52}$$
and so on. To verify that our supplementary conditions are consistent
we have to look into all the further supplementary conditions obtain- 
able by this procedure to see that they can be satisfied, which we can 
usually do by showing that after a certain point the further supple- 
mentary conditions are all either identically satisfied or repetitions 
of the previous ones. 

We must also verify that the supplementary conditions are in agree- 
ment with the equations of motion. In the Heisenberg picture, for 
which the ket $|P\rangle$ in (51) is fixed, we shall have different supplementary
conditions referring to different times and they must all be consistent,
in the way discussed above. In the Schrödinger picture, for which the
ket $|P\rangle$ varies with the time in accordance with Schrödinger’s equation,
we require that if $|P\rangle$ satisfies the supplementary conditions initially
it satisfies them always. This means that $d|P\rangle/dt$ must satisfy the
supplementary conditions, or that $H|P\rangle$ must satisfy the supplementary
conditions, or that $H$ must be physical.

It is convenient when we have a supplementary condition $U|P\rangle = 0$
to write
$$U \approx 0 \tag{53}$$
and to call (53) a weak equation, in distinction to an ordinary or strong
equation. A weak equation gives another weak equation if it is 
multiplied by any factor on the left, but does not in general give a valid 
equation if it is multiplied by a factor on the right. Thus a weak 
equation must not be used in working out P.B.s. With this way of 
speaking, the requirement (52) that the supplementary conditions are 
consistent becomes the requirement that the P.B.s of the operators 
in the supplementary conditions shall vanish weakly. 
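
The asymmetry between factors on the left and on the right can be seen in a small finite-dimensional model. The following is a minimal sketch under stated assumptions: the two-component ket `P` and the matrices `U` and `F` are invented purely for illustration and have nothing to do with the field variables of this chapter.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def act(m, v):
    """Matrix m acting on the column vector v."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

P = [1, 0]                 # a ket obeying the supplementary condition
U = [[0, 1], [0, 0]]       # U|P> = 0, so U vanishes weakly
F = [[0, 1], [1, 0]]       # an arbitrary extra factor

print(act(U, P))              # [0, 0]
print(act(matmul(F, U), P))   # [0, 0]: factor on the left, still weakly zero
print(act(matmul(U, F), P))   # [1, 0]: factor on the right, no longer zero
```

In this toy model the commutator of `U` and `F` does not annihilate `P`, so `F` would not count as physical in the sense defined above.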

The condition for a dynamical variable $\xi$ to be physical is that, for
each supplementary condition $U|P\rangle = 0$, we have
$$U\xi|P\rangle = 0,$$
and hence
$$[U, \xi]|P\rangle = 0.$$
Thus the condition is that the P.B. of the dynamical variable with 
each of the operators of the supplementary conditions shall vanish 
weakly. 

Let us now return to electrodynamics. We take equation (2) to be
a weak equation, so it should be written
$$\partial A_\mu/\partial x_\mu \approx 0. \tag{54}$$
In the Heisenberg picture we have one of these equations for each
point $x$. To check their consistency, we take two arbitrary points $x$
and $x'$ in space-time and form the P.B.
$$\left[\frac{\partial A_\mu(x)}{\partial x_\mu}, \frac{\partial A_\nu(x')}{\partial x_\nu'}\right] = \frac{\partial^2}{\partial x_\mu\,\partial x_\nu'}[A_\mu(x), A_\nu(x')].$$
Evaluating it with the help of (28), we get
$$g_{\mu\nu}\frac{\partial^2\Delta(x-x')}{\partial x_\mu\,\partial x_\nu'} = -\Box\Delta(x-x') = 0$$
from (25), so the requirements for consistency are satisfied strongly.
As we have verified that the supplementary conditions are consistent 
at all times in the Heisenberg picture, we have verified that they are 
in agreement with the equations of motion. 

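Since equation (54) is only a weak equation, any of its consequences
in the ordinary Maxwell theory will be valid in the quantum theory
only as weak equations. The equations
$$\operatorname{div}\mathcal{H} = 0, \qquad \partial\mathcal{H}/\partial t = -\operatorname{curl}\mathcal{E}$$
follow simply from the definitions of $\mathcal{E}$ and $\mathcal{H}$ in terms of the potentials,
so they are valid strongly in the quantum theory. The other Maxwell
equations for empty space, namely
$$\operatorname{div}\mathcal{E} \approx 0, \qquad \partial\mathcal{E}/\partial t \approx \operatorname{curl}\mathcal{H}, \tag{55}$$
are weak equations in the quantum theory, because one needs the
help of (54) as well as (1) in deriving them.

The field quantities $\mathcal{E}$ and $\mathcal{H}$ are components of the antisymmetric
tensor $\partial A^\nu/\partial x_\mu - \partial A^\mu/\partial x_\nu$. The P.B. of the tensor with the operator
of (54) at a general point $x'$ is
$$\left[\frac{\partial A^\nu(x)}{\partial x_\mu}-\frac{\partial A^\mu(x)}{\partial x_\nu}, \frac{\partial A_\sigma(x')}{\partial x_\sigma'}\right] = \frac{\partial^2\Delta(x-x')}{\partial x_\mu\,\partial x_\nu'} - \frac{\partial^2\Delta(x-x')}{\partial x_\nu\,\partial x_\mu'} = 0.$$
It follows that $\mathcal{E}$ and $\mathcal{H}$ are physical. The potentials $A_\mu$ are not
physical.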
The supplementary conditions affecting the dynamical variables at
a particular time are
$$\frac{\partial A_\mu}{\partial x_\mu} \approx 0, \qquad \frac{\partial}{\partial x_0}\frac{\partial A_\mu}{\partial x_\mu} \approx 0. \tag{56}$$
Higher differentiations with respect to $x_0$ do not give independent
equations, but equations which are consequences of these and the
strong equation (1). Thus in terms of the Schrödinger variables of
§ 76, the supplementary conditions are
$$B_0 + A_r^{\,r} \approx 0 \tag{57}$$
and
$$\nabla^2 A_0 + B_r^{\,r} \approx 0. \tag{58}$$
Equation (58) is the same as the first of equations (55) and may also
be written, from (39),
$$\nabla^2(A_0+U) \approx 0.$$
Since this holds throughout three-dimensional space, it leads to
$$A_0 + U \approx 0. \tag{59}$$
Noting that $A_r^{\,r} = V^{rr}$, we can now see from (49) that
$$H_{FL} \approx 0. \tag{60}$$
Thus there is no longitudinal field energy for states that occur in nature.




To set up a convenient representation, we introduce a standard ket
$|0_F\rangle$ satisfying the supplementary conditions
$$(B_0+A_r^{\,r})|0_F\rangle = 0, \qquad (A_0+U)|0_F\rangle = 0, \tag{61}$$
and also satisfying
$$\bar{\mathcal{A}}_{r\mathbf{k}}|0_F\rangle = 0. \tag{62}$$
These conditions are consistent, because $\bar{\mathcal{A}}_{r\mathbf{k}}$ commutes with the
operators in (61), and they are sufficient to fix $|0_F\rangle$ completely, apart
from a numerical factor, because the only independent dynamical
variables that we have are $A_0$, $B_0$, $U$, $A_r^{\,r}$, $\mathcal{A}_{r\mathbf{k}}$, $\bar{\mathcal{A}}_{r\mathbf{k}}$, and of these
$A_0+U$, $B_0+A_r^{\,r}$, $\bar{\mathcal{A}}_{r\mathbf{k}}$ form a complete commuting set. With this
standard ket we can express any ket as
$$\Psi(A_0, B_0, \mathcal{A}_{r\mathbf{k}})|0_F\rangle. \tag{63}$$
Our representation is just the Fock representation so far as concerns
the transverse dynamical variables $\mathcal{A}_{r\mathbf{k}}$, $\bar{\mathcal{A}}_{r\mathbf{k}}$, so $\Psi$ must be a power
series in the variables $\mathcal{A}_{r\mathbf{k}}$, with different terms in the series
corresponding to the presence of different numbers of photons. The number
of variables occurring in $\Psi$ is a continuous infinity, so $\Psi$ is what
mathematicians call a ‘functional’.

If the ket (63) satisfies the supplementary conditions, $\Psi$ must be
independent of $A_0$ and $B_0$, and thus a function only of the $\mathcal{A}_{r\mathbf{k}}$. So
physical states are represented by kets of the form
$$\Psi(\mathcal{A}_{r\mathbf{k}})|0_F\rangle, \tag{64}$$
with $\Psi$ a power series in the variables $\mathcal{A}_{r\mathbf{k}}$. The standard ket $|0_F\rangle$
itself represents the physical state with no photons present, the perfect
vacuum.

Our Hamiltonian $H_F$ and its parts $H_{FL}$, $H_{FT}$ have so far contained
arbitrary numerical terms. It is convenient to choose these terms so
that $H_{FL}$, $H_{FT}$ are zero for the perfect vacuum. The result (60) shows
that $H_{FL}$ given by (48) or (49) has the numerical term in it correctly
chosen to make $H_{FL}$ have the value zero for the perfect vacuum, as
well as for every other physical state. We must take $H_{FT}$ to be
$$H_{FT} = 4\pi^2\int k_0^2\,\mathcal{A}_{r\mathbf{k}}\bar{\mathcal{A}}_{r\mathbf{k}}\,d^3k, \tag{65}$$
the transverse part of (19), in order that the numerical term in it may
be correctly chosen to give no zero-point energy for the photons.
(47) differs from (65) by an infinite numerical term, consisting of a
half-quantum of energy for each photon state.


78. Electrons and positrons by themselves

We now consider electrons and positrons in the absence of
electromagnetic field. The state of an electron is described, as in Chapter XI,
by a wave function $\psi$ with four components $\psi_a$ ($a = 1, 2, 3, 4$),
satisfying the wave equation
$$i\hbar\,\frac{\partial\psi}{\partial x_0} = -i\hbar\,\alpha_r\frac{\partial\psi}{\partial x_r} + \alpha_m m\psi. \tag{66}$$
To get a many-electron theory we shall apply the method of second
quantization of § 65, which involves changing the one-electron wave
function into a set of operators satisfying certain anticommutation
relations.

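When we are dealing with $\psi$ at various places at a given time we
may write it $\psi_{\mathbf{x}}$, with $\mathbf{x}$ denoting $x_1, x_2, x_3$. Its components are then
$\psi_{a\mathbf{x}}$. We pass to the momentum representation with the wave function
$\psi_{\mathbf{p}}$ by a three-dimensional Fourier resolution
$$\psi_{\mathbf{x}} = h^{-3/2}\int e^{i(\mathbf{x}\mathbf{p})/\hbar}\,\psi_{\mathbf{p}}\,d^3p, \qquad \psi_{\mathbf{p}} = h^{-3/2}\int e^{-i(\mathbf{x}\mathbf{p})/\hbar}\,\psi_{\mathbf{x}}\,d^3x. \tag{67}$$
$\psi_{\mathbf{p}}$ has four components $\psi_{a\mathbf{p}}$, corresponding to the four components
of $\psi_{\mathbf{x}}$. In this representation the energy operator is
$$p_0 = \alpha_rp_r + \alpha_m m,$$
in which the momentum operators $p_r$ are multiplying factors.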
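We can separate $\psi$ into a positive-energy part $\xi$ and a
negative-energy part $\zeta$,
$$\psi = \xi + \zeta,$$
$\xi$ and $\zeta$ each having four components like $\psi$. In the momentum
representation they are given by
$$\xi_{\mathbf{p}} = \tfrac12\Bigl\{1+\frac{\alpha_rp_r+\alpha_m m}{(\mathbf{p}^2+m^2)^{1/2}}\Bigr\}\psi_{\mathbf{p}}, \qquad \zeta_{\mathbf{p}} = \tfrac12\Bigl\{1-\frac{\alpha_rp_r+\alpha_m m}{(\mathbf{p}^2+m^2)^{1/2}}\Bigr\}\psi_{\mathbf{p}}, \tag{68}$$
since these equations lead to
$$p_0\xi_{\mathbf{p}} = (\alpha_rp_r+\alpha_m m)\xi_{\mathbf{p}} = \tfrac12\{\alpha_rp_r+\alpha_m m+(\mathbf{p}^2+m^2)^{1/2}\}\psi_{\mathbf{p}} = (\mathbf{p}^2+m^2)^{1/2}\,\xi_{\mathbf{p}},$$
and similarly
$$p_0\zeta_{\mathbf{p}} = -(\mathbf{p}^2+m^2)^{1/2}\,\zeta_{\mathbf{p}},$$
showing that $\xi_{\mathbf{p}}$ and $\zeta_{\mathbf{p}}$ are eigenfunctions of $p_0$ with the eigenvalues
$(\mathbf{p}^2+m^2)^{1/2}$ and $-(\mathbf{p}^2+m^2)^{1/2}$ respectively. When one is working with
the operators
$$\tfrac12\Bigl\{1+\frac{\alpha_rp_r+\alpha_m m}{(\mathbf{p}^2+m^2)^{1/2}}\Bigr\}, \qquad \tfrac12\Bigl\{1-\frac{\alpha_rp_r+\alpha_m m}{(\mathbf{p}^2+m^2)^{1/2}}\Bigr\},$$
one should note that their squares are equal to themselves and their
product in either order is zero.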
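
These properties of the two operators are easily verified with explicit matrices. The sketch below is an illustration only: it assumes one standard representation of the $\alpha$ matrices (the text does not depend on any particular one), and the momentum and mass values are arbitrary.

```python
import math

def zeros():
    return [[0j] * 4 for _ in range(4)]

def identity():
    return [[1 + 0j if i == j else 0j for j in range(4)] for i in range(4)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def combine(ca, a, cb, b):
    """Return ca*a + cb*b for 4x4 matrices a and b."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(4)] for i in range(4)]

def max_abs(a):
    return max(abs(a[i][j]) for i in range(4) for j in range(4))

def off_diagonal(sigma):
    """4x4 matrix with the 2x2 block sigma in both off-diagonal positions."""
    m = zeros()
    for i in range(2):
        for j in range(2):
            m[i][j + 2] = sigma[i][j]
            m[i + 2][j] = sigma[i][j]
    return m

# Assumed representation: alpha_r built from the Pauli matrices,
# alpha_m diagonal with entries 1, 1, -1, -1.
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
ALPHA = [off_diagonal(s) for s in SIGMA]
ALPHA_M = [[(1 if i < 2 else -1) * (1 if i == j else 0) + 0j for j in range(4)]
           for i in range(4)]

def projectors(p, m):
    """Return the two operators of the text, M = alpha_r p_r + alpha_m m, and E."""
    big_m = combine(m, ALPHA_M, 0, zeros())
    for p_r, alpha_r in zip(p, ALPHA):
        big_m = combine(1, big_m, p_r, alpha_r)
    e = math.sqrt(sum(x * x for x in p) + m * m)
    plus = combine(0.5, identity(), 0.5 / e, big_m)
    minus = combine(0.5, identity(), -0.5 / e, big_m)
    return plus, minus, big_m, e

plus, minus, big_m, e = projectors((0.3, -1.2, 0.7), 1.0)
square_error = max_abs(combine(1, matmul(plus, plus), -1, plus))
product_error = max_abs(matmul(plus, minus))
eigen_error = max_abs(combine(1, matmul(big_m, plus), -e, plus))
print(square_error, product_error, eigen_error)   # all at rounding level
```

The third number checks that the energy operator $p_0$ acting on the positive-energy part reproduces it with the factor $(\mathbf{p}^2+m^2)^{1/2}$.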

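The second quantization makes the $\psi$’s into operators like the $\eta$’s
of § 65, satisfying anticommutation relations like (11’) of § 65. Using
the notation for the anticommutator
$$MN + NM = [M, N]_+, \tag{69}$$
we get
$$[\psi_{a\mathbf{x}}, \psi_{b\mathbf{x}'}]_+ = 0, \qquad [\bar\psi_{a\mathbf{x}}, \bar\psi_{b\mathbf{x}'}]_+ = 0, \qquad [\psi_{a\mathbf{x}}, \bar\psi_{b\mathbf{x}'}]_+ = \delta_{ab}\,\delta(\mathbf{x}-\mathbf{x}'), \tag{70}$$
the function $\delta(\mathbf{x}-\mathbf{x}')$ appearing in the last equation owing to the $\mathbf{x}$’s
taking on continuous ranges of values. On transforming to the
p-representation according to (67), we get
$$[\psi_{a\mathbf{p}}, \psi_{b\mathbf{p}'}]_+ = 0, \qquad [\bar\psi_{a\mathbf{p}}, \bar\psi_{b\mathbf{p}'}]_+ = 0, \qquad [\psi_{a\mathbf{p}}, \bar\psi_{b\mathbf{p}'}]_+ = \delta_{ab}\,\delta(\mathbf{p}-\mathbf{p}'). \tag{71}$$
With $\xi$ and $\zeta$ defined again by (68), the last of equations (71) gives
$$\begin{aligned}
[\xi_{a\mathbf{p}}, \bar\xi_{b\mathbf{p}'}]_+ &= \tfrac12\Bigl\{1+\frac{\alpha_rp_r+\alpha_m m}{(\mathbf{p}^2+m^2)^{1/2}}\Bigr\}_{ac}[\psi_{c\mathbf{p}}, \bar\psi_{d\mathbf{p}'}]_+\,\tfrac12\Bigl\{1+\frac{\alpha_rp_r'+\alpha_m m}{(\mathbf{p}'^2+m^2)^{1/2}}\Bigr\}_{db}\\
&= \tfrac12\Bigl\{1+\frac{\alpha_rp_r+\alpha_m m}{(\mathbf{p}^2+m^2)^{1/2}}\Bigr\}_{ab}\,\delta(\mathbf{p}-\mathbf{p}'),
\end{aligned}\tag{72}$$
and similarly
$$[\zeta_{a\mathbf{p}}, \bar\zeta_{b\mathbf{p}'}]_+ = \tfrac12\Bigl\{1-\frac{\alpha_rp_r+\alpha_m m}{(\mathbf{p}^2+m^2)^{1/2}}\Bigr\}_{ab}\,\delta(\mathbf{p}-\mathbf{p}') \tag{73}$$
and
$$[\xi_{a\mathbf{p}}, \bar\zeta_{b\mathbf{p}'}]_+ = [\zeta_{a\mathbf{p}}, \bar\xi_{b\mathbf{p}'}]_+ = 0.$$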

According to the interpretation of § 65, the operators $\psi_{a\mathbf{p}}$ are
operators of annihilation of an electron of momentum $\mathbf{p}$ and the
operators $\bar\psi_{a\mathbf{p}}$ are operators of creation of an electron of momentum $\mathbf{p}$.
To avoid the unphysical notion of negative-energy electrons, we must
pass over to a new interpretation based on the positron theory of § 73.
The annihilation of a negative-energy electron is to be understood as
the creation of a hole in the sea of negative-energy electrons, or the
creation of a positron. So the operators $\zeta_{a\mathbf{p}}$ become operators of
creation of a positron. The positron has the momentum $-\mathbf{p}$, because
an amount $\mathbf{p}$ of momentum gets annihilated. Similarly the $\bar\zeta_{a\mathbf{p}}$ become
operators of annihilation of a positron of momentum $-\mathbf{p}$. The $\xi_{a\mathbf{p}}$
and $\bar\xi_{a\mathbf{p}}$ are operators of annihilation and creation respectively of an
ordinary, positive-energy electron of momentum $\mathbf{p}$.

It should be noted that, although $\xi_{\mathbf{p}}$ has four components, only
two of them are independent, because the four are connected by
$$\Bigl\{1-\frac{\alpha_rp_r+\alpha_m m}{(\mathbf{p}^2+m^2)^{1/2}}\Bigr\}\xi_{\mathbf{p}} = 0,$$
which involves two independent equations. The two independent
components of $\xi_{\mathbf{p}}$ correspond to the annihilation of an electron in
each of the two independent states of spin. Similarly $\zeta_{\mathbf{p}}$ has only
two independent components, because of the equations
$$\Bigl\{1+\frac{\alpha_rp_r+\alpha_m m}{(\mathbf{p}^2+m^2)^{1/2}}\Bigr\}\zeta_{\mathbf{p}} = 0,$$
and they correspond to the creation of a positron in the two
independent states of spin.

The vacuum state, for which there are no electrons or positrons
present, is represented by the ket $|0_P\rangle$ satisfying
$$\xi_{a\mathbf{p}}|0_P\rangle = 0, \qquad \bar\zeta_{a\mathbf{p}}|0_P\rangle = 0. \tag{74}$$
We can use this ket as the standard ket of a representation. We then
have any ket expressed as
$$\Psi(\bar\xi_{a\mathbf{p}}, \zeta_{a\mathbf{p}})|0_P\rangle,$$
in which the function, or rather functional, $\Psi$ is a power series in the
variables $\bar\xi_{a\mathbf{p}}$, $\zeta_{a\mathbf{p}}$. Each term of $\Psi$ is like (17’) of § 65. It must not
contain any of its variables to a higher power than the first. It
corresponds to the existence of certain (positive-energy) electrons and
certain positrons, in states specified by the labels of the variables
appearing in it.
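
The restriction to first powers follows from the anticommutation relations, since an operator that anticommutes with itself has zero square. A minimal sketch with two discrete states in place of the continuous labels $\mathbf{p}$ is given below. The 4x4 matrices are an assumed representation chosen for illustration (the Jordan-Wigner construction) and do not appear in the text.

```python
def kron(a, b):
    """Kronecker product of two 2x2 matrices, giving a 4x4 matrix."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] + ba[i][j] for j in range(n)] for i in range(n)]

def is_zero(a):
    return all(x == 0 for row in a for x in row)

ONE = [[1, 0], [0, 1]]
SIGN = [[1, 0], [0, -1]]      # supplies the relative sign between the states
CREATE = [[0, 0], [1, 0]]     # empty state -> occupied state
DESTROY = [[0, 1], [0, 0]]    # occupied state -> empty state

c1_dag, c1 = kron(CREATE, ONE), kron(DESTROY, ONE)
c2_dag, c2 = kron(SIGN, CREATE), kron(SIGN, DESTROY)

print(is_zero(matmul(c1_dag, c1_dag)))          # True: no second power
print(is_zero(anticommutator(c1_dag, c2_dag)))  # True: creations anticommute
print(anticommutator(c1, c1_dag))               # the unit matrix
```

The last line is the discrete counterpart of the third of equations (71), with a two-suffix $\delta$ symbol in place of the $\delta$ function.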

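From (12’) of § 65, the total number of electrons is $\int \bar\psi_{a\mathbf{p}}\psi_{a\mathbf{p}}\,d^3p$
summed over $a$. We may write it in the notation of equation (12) of
§ 67 as $\int \bar\psi^\dagger_{\mathbf{p}}\psi_{\mathbf{p}}\,d^3p$. Transforming it to the x-representation by (67),
we get
$$h^{-3}\iiint e^{i(\mathbf{x}\mathbf{p})/\hbar}e^{-i(\mathbf{x}'\mathbf{p})/\hbar}\,\bar\psi^\dagger_{\mathbf{x}}\psi_{\mathbf{x}'}\,d^3x\,d^3x'\,d^3p = \int \bar\psi^\dagger_{\mathbf{x}}\psi_{\mathbf{x}}\,d^3x,$$
showing that the density of the electrons is $\bar\psi^\dagger_{\mathbf{x}}\psi_{\mathbf{x}}$. This result includes
an infinite constant representing the density of the sea of
negative-energy electrons.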

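We get a quantity of more physical significance if we take the total
charge $Q$, equal to the number of positive-energy electrons minus the
number of holes or positrons, all multiplied by $-e$. Thus
$$Q = -e\int (\bar\xi^\dagger_{\mathbf{p}}\xi_{\mathbf{p}} - \zeta^\dagger_{\mathbf{p}}\bar\zeta_{\mathbf{p}})\,d^3p. \tag{75}$$
We can evaluate this with the help of (68). Using the transpose of
the second of these equations, namely
$$\zeta^\dagger_{\mathbf{p}} = \psi^\dagger_{\mathbf{p}}\,\tfrac12\Bigl\{1-\frac{\alpha^\dagger_rp_r+\alpha^\dagger_m m}{(\mathbf{p}^2+m^2)^{1/2}}\Bigr\},$$
we get
$$Q = -e\int \Bigl[\bar\psi^\dagger_{\mathbf{p}}\,\tfrac12\Bigl\{1+\frac{\alpha_rp_r+\alpha_m m}{(\mathbf{p}^2+m^2)^{1/2}}\Bigr\}\psi_{\mathbf{p}} - \psi^\dagger_{\mathbf{p}}\,\tfrac12\Bigl\{1-\frac{\alpha^\dagger_rp_r+\alpha^\dagger_m m}{(\mathbf{p}^2+m^2)^{1/2}}\Bigr\}\bar\psi_{\mathbf{p}}\Bigr]\,d^3p.$$
Now for any matrix $\alpha$ whose diagonal sum is zero, the anticommutation
relations (71) give
$$\bar\psi^\dagger_{\mathbf{p}}\alpha\psi_{\mathbf{p}'} + \psi^\dagger_{\mathbf{p}'}\alpha^\dagger\bar\psi_{\mathbf{p}} = \alpha_{ab}(\bar\psi_{a\mathbf{p}}\psi_{b\mathbf{p}'} + \psi_{b\mathbf{p}'}\bar\psi_{a\mathbf{p}}) = \alpha_{aa}\,\delta(\mathbf{p}-\mathbf{p}') = 0, \tag{76}$$
a result which we may assume still holds for $\mathbf{p}' = \mathbf{p}$. Then the
expression for $Q$ reduces to
$$Q = -\tfrac12 e\int (\bar\psi^\dagger_{\mathbf{p}}\psi_{\mathbf{p}} - \psi^\dagger_{\mathbf{p}}\bar\psi_{\mathbf{p}})\,d^3p.$$
Transforming it to the x-representation as before, we get
$$Q = -\tfrac12 e\int (\bar\psi^\dagger_{\mathbf{x}}\psi_{\mathbf{x}} - \psi^\dagger_{\mathbf{x}}\bar\psi_{\mathbf{x}})\,d^3x,$$
showing that the charge density is
$$j_{0\mathbf{x}} = -\tfrac12 e\,(\bar\psi^\dagger_{\mathbf{x}}\psi_{\mathbf{x}} - \psi^\dagger_{\mathbf{x}}\bar\psi_{\mathbf{x}}). \tag{77}$$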
The interpretation of the one-electron wave function in § 68 gives,
besides the probability density $\bar\psi^\dagger\psi$, a probability current $\bar\psi^\dagger\alpha_r\psi$.
With second quantization we shall have correspondingly a flow of
electrons, given by the operator $\bar\psi^\dagger_{\mathbf{x}}\alpha_r\psi_{\mathbf{x}}$. The sea of negative-energy
electrons produces no resultant flow of electrons, from symmetry,
and so the electric current is
$$j_{r\mathbf{x}} = -e\,\bar\psi^\dagger_{\mathbf{x}}\alpha_r\psi_{\mathbf{x}}. \tag{78}$$
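The total energy of the electrons is, from formula (29) of § 60, which
is valid also for fermions,
$$H_T = \int \bar\psi^\dagger_{\mathbf{p}}\,p_0\,\psi_{\mathbf{p}}\,d^3p = \int \bar\psi^\dagger_{\mathbf{p}}(\alpha_rp_r+\alpha_m m)\psi_{\mathbf{p}}\,d^3p. \tag{79}$$
It becomes, when transformed to the x-representation,
$$H_T = \int \bar\psi^\dagger_{\mathbf{x}}(-i\hbar\,\alpha_r\psi^{\,r}_{\mathbf{x}} + \alpha_m m\,\psi_{\mathbf{x}})\,d^3x. \tag{80}$$
This total energy contains an infinite numerical term representing
the energy of the sea of negative-energy electrons.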
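We get a quantity of more physical significance if we take the energy
of all the electrons and positrons, reckoning the energy of the vacuum
as zero. This quantity is
$$\begin{aligned}
H_P &= \int (\mathbf{p}^2+m^2)^{1/2}\,(\bar\xi^\dagger_{\mathbf{p}}\xi_{\mathbf{p}} + \zeta^\dagger_{\mathbf{p}}\bar\zeta_{\mathbf{p}})\,d^3p \qquad (81)\\
&= \int (\mathbf{p}^2+m^2)^{1/2}\Bigl[\bar\psi^\dagger_{\mathbf{p}}\,\tfrac12\Bigl\{1+\frac{\alpha_rp_r+\alpha_m m}{(\mathbf{p}^2+m^2)^{1/2}}\Bigr\}\psi_{\mathbf{p}} + \psi^\dagger_{\mathbf{p}}\,\tfrac12\Bigl\{1-\frac{\alpha^\dagger_rp_r+\alpha^\dagger_m m}{(\mathbf{p}^2+m^2)^{1/2}}\Bigr\}\bar\psi_{\mathbf{p}}\Bigr]\,d^3p\\
&= \tfrac12\int \{\bar\psi^\dagger_{\mathbf{p}}(\alpha_rp_r+\alpha_m m)\psi_{\mathbf{p}} - \psi^\dagger_{\mathbf{p}}(\alpha^\dagger_rp_r+\alpha^\dagger_m m)\bar\psi_{\mathbf{p}}\}\,d^3p\\
&\quad + \tfrac12\int (\mathbf{p}^2+m^2)^{1/2}\,(\bar\psi^\dagger_{\mathbf{p}}\psi_{\mathbf{p}} + \psi^\dagger_{\mathbf{p}}\bar\psi_{\mathbf{p}})\,d^3p. \qquad (82)
\end{aligned}$$
From (76), the first integral in (82) is the same as (79) and is just
$H_T$. The second integral is an infinite constant and is minus the
energy of all the negative-energy electrons of the vacuum distribution.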

We may take either $H_T$ or $H_P$ as the Hamiltonian. The Heisenberg
equation of motion for $\psi_{a\mathbf{x}}$ is thus
$$\partial\psi_{a\mathbf{x}}/\partial x_0 = [\psi_{a\mathbf{x}}, H_T] = [\psi_{a\mathbf{x}}, H_P],$$
and if we work this out we just get back to the wave equation for $\psi$,
namely (66).

We must now look into the question of whether the theory is 
relativistic. It is built up from operators $\psi$ which satisfy the field
equations (66). These equations are the same as the wave equation
for the one-electron wave function and are known to be invariant
under Lorentz transformations, provided $\psi$ transforms according to
the law (20) of Chapter XI. Our present theory goes beyond the
one-electron theory in that anticommutation relations are introduced
for the $\psi$’s and $\bar\psi$’s, and it becomes necessary to verify that these
anticommutation relations are Lorentz invariant. 

We proceed by a method analogous to that of § 75. We take two
general points $x$ and $x'$ in space-time and form the anticommutator
$$K_{ab}(x, x') = \psi_a(x)\bar\psi_b(x') + \bar\psi_b(x')\psi_a(x). \tag{83}$$
We can evaluate it by working directly from the anticommutation
relations (71) for the Fourier components of $\psi$ and $\bar\psi$. A simpler way
is to note certain properties that $K_{ab}(x, x')$ must have, namely

(i) it involves $x_\mu$ and $x_\mu'$ only through their difference $x_\mu - x_\mu'$;
(ii) it satisfies the wave equation
$$\Bigl(i\hbar\frac{\partial}{\partial x_0} + i\hbar\,\alpha_r\frac{\partial}{\partial x_r} - \alpha_m m\Bigr)_{ab}K_{bc}(x, x') = 0 \tag{84}$$
on account of $\psi(x)$ satisfying (66);
(iii) for $x_0 = x_0'$ it has the value $\delta_{ab}\,\delta(\mathbf{x}-\mathbf{x}')$, as follows from the
third of equations (70).

These properties are sufficient to fix $K_{ab}(x, x')$ completely, since (iii)
fixes it for $x_0 = x_0'$, (ii) shows how it depends on $x_0$, and (i) then shows
how it depends on $x_0'$. The solution is easily seen to be
$$K_{ab}(x, x') = h^{-3}\int \sum \tfrac12\Bigl\{1+\frac{\alpha_rp_r+\alpha_m m}{p_0}\Bigr\}_{ab}\,e^{-i(x-x')\cdot p/\hbar}\,d^3p, \tag{85}$$
where the $\sum$ means a summation over the two values $\pm(\mathbf{p}^2+m^2)^{1/2}$ for
$p_0$ with particular values for $p_1, p_2, p_3$. It satisfies (ii) since the operator
in (84) produces the factor $(p_0-\alpha_rp_r-\alpha_m m)$ in the integrand of
(85), which factor gives zero when multiplied on the left into the
factor $\{\ \}$. It satisfies (iii) since, with $x_0 = x_0'$, the summation over $p_0$
makes the second term in $\{\ \}$ cancel out.

The law of transformation for $\psi$ and $\bar\psi$ given in § 68 has the effect
of making the quantities $\bar\psi^\dagger(x')\alpha_\mu\psi(x)$ transform like the four
components of a 4-vector and making $\bar\psi^\dagger(x')\alpha_m\psi(x)$ invariant. Thus
$$l^\mu\,\bar\psi^\dagger(x')\alpha_\mu\psi(x) + S\,\bar\psi^\dagger(x')\alpha_m\psi(x) \tag{86}$$
is invariant with $l$ any 4-vector and $S$ any scalar. The invariance
of (86) must be sufficient to ensure the correct transformation law
for $\psi$ and $\bar\psi$, since it enables one to deduce the invariance of the wave
equation for $\psi$, by taking $l^\mu = i\hbar\,\partial/\partial x_\mu$, $S = -m$.

The invariance of (86) leads to the invariance of
$$(l^\mu\alpha_\mu + S\alpha_m)_{ab}\{\bar\psi_a(x')\psi_b(x) + \psi_b(x)\bar\psi_a(x')\}.$$
Thus
$$(l^\mu\alpha_\mu + S\alpha_m)_{ab}K_{ba}(x, x') \tag{87}$$
should be invariant with $K_{ab}(x, x')$ given by (85), and its invariance
would be sufficient to ensure the invariance of the anticommutation
relations. We get for (87)
$$\begin{aligned}
&h^{-3}\int \sum \tfrac12(l^\mu\alpha_\mu + S\alpha_m)_{ab}(p_0+\alpha_rp_r+\alpha_m m)_{ba}\,e^{-i(x-x')\cdot p/\hbar}\,p_0^{-1}\,d^3p\\
&\quad = h^{-3}\int \sum \tfrac12\{(l_0-l_r\alpha_r+S\alpha_m)(p_0+\alpha_rp_r+\alpha_m m)\}_{aa}\,e^{-i(x-x')\cdot p/\hbar}\,p_0^{-1}\,d^3p\\
&\quad = h^{-3}\int \sum 2\,(l_0p_0-l_rp_r+Sm)\,e^{-i(x-x')\cdot p/\hbar}\,p_0^{-1}\,d^3p.
\end{aligned}\tag{88}$$
This is Lorentz invariant because the differential element $p_0^{-1}\,d^3p$ is
Lorentz invariant. Thus the relativistic invariance of the theory is
proved.
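
The invariance of the element $p_0^{-1}\,d^3p$ used here can be checked numerically for a boost along one axis. The sketch below is only an illustration under stated assumptions: units with the velocity of light equal to unity, an arbitrary momentum, mass and velocity, and a finite-difference Jacobian in place of the analytic one.

```python
import math

def energy(p, m):
    """p0 on the mass shell, with the velocity of light taken as 1."""
    return math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2 + m * m)

def boost(p, m, v):
    """Momentum seen from a frame moving with velocity v along axis 1."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return (gamma * (p[0] - v * energy(p, m)), p[1], p[2])

def det3(a):
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def jacobian_det(p, m, v, h=1e-5):
    """Determinant of d p'_i / d p_j, estimated by central differences."""
    jac = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        up = list(p)
        down = list(p)
        up[j] += h
        down[j] -= h
        p_up, p_down = boost(up, m, v), boost(down, m, v)
        for i in range(3):
            jac[i][j] = (p_up[i] - p_down[i]) / (2 * h)
    return det3(jac)

p, m, v = (0.4, -0.9, 1.3), 1.0, 0.6
p_new = boost(p, m, v)
ratio_of_elements = jacobian_det(p, m, v)              # d^3p' / d^3p
ratio_of_energies = energy(p_new, m) / energy(p, m)    # p0' / p0
print(ratio_of_elements, ratio_of_energies)            # equal to high accuracy
```

Since the two ratios agree, the volume element and $p_0$ change by the same factor, so their quotient is unchanged by the boost.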


79. The interaction 

The complete Hamiltonian for electrons and positrons interacting
with the electromagnetic field is
$$H = H_F + H_P + H_Q, \tag{89}$$
where $H_F$ is the Hamiltonian for the electromagnetic field alone,
given by (19) or (45), $H_P$ is the Hamiltonian for the electrons and
positrons alone, given by (80) or (81), and $H_Q$ is the interaction energy,
involving the dynamical variables of the electrons and positrons as
well as those of the electromagnetic field. We take
$$H_Q = \int A^\mu j_\mu\,d^3x, \tag{90}$$
with $j_\mu$ given by (77) and (78), as we shall see that this gives the correct
equations of motion. Thus, with neglect of infinite numerical terms,
$$\begin{aligned}
H &= \int \{\bar\psi^\dagger\alpha_r(-i\hbar\,\psi^{\,r} - eA^r\psi) + \bar\psi^\dagger\alpha_m m\,\psi - \tfrac12 eA_0(\bar\psi^\dagger\psi - \psi^\dagger\bar\psi)\}\,d^3x\\
&\quad - (8\pi)^{-1}\int (B_\mu B^\mu + A_\mu^{\,r}A^{\mu r})\,d^3x.
\end{aligned}\tag{91}$$


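Let us work out the Heisenberg equations of motion that follow
from the Hamiltonian (91). We have
$$\begin{aligned}
i\hbar\,\partial\psi_{a\mathbf{x}}/\partial x_0 &= \psi_{a\mathbf{x}}(H_P+H_Q) - (H_P+H_Q)\psi_{a\mathbf{x}}\\
&= \int [\psi_{a\mathbf{x}}, \bar\psi_{b\mathbf{x}'}]_+\{\alpha_r(-i\hbar\,\psi^{\,r}_{\mathbf{x}'} - eA^r_{\mathbf{x}'}\psi_{\mathbf{x}'}) + \alpha_m m\,\psi_{\mathbf{x}'} - eA_{0\mathbf{x}'}\psi_{\mathbf{x}'}\}_b\,d^3x'\\
&= \{\alpha_r(-i\hbar\,\psi^{\,r}_{\mathbf{x}} - eA^r_{\mathbf{x}}\psi_{\mathbf{x}}) + \alpha_m m\,\psi_{\mathbf{x}} - eA_{0\mathbf{x}}\psi_{\mathbf{x}}\}_a.
\end{aligned}$$
Thus
$$\Bigl\{\alpha_\mu\Bigl(i\hbar\frac{\partial}{\partial x_\mu} + eA^\mu\Bigr) - \alpha_m m\Bigr\}\psi = 0. \tag{92}$$
This agrees with the one-electron wave equation (11) of Chapter XI.
Since $H$ is real, the equation of motion for $\bar\psi$ will be the conjugate of
the equation of motion for $\psi$ and so will agree with (12) of Chapter XI.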
Thus the interaction (90) gives correctly the action of the field on the 
electrons and positrons. Further we have, making use of the P.B. 
relations in (46), 


$\partial A_\mu/\partial x_0 = [A_\mu, H] = [A_\mu, H_F]$
$= B_\mu$ (93)
and $\partial B_\mu/\partial x_0 = [B_\mu, H] = [B_\mu, H_F] + [B_\mu, H_Q]$
$= \nabla^2 A_\mu + \int[B_\mu, A^\nu_{x'}]\,j_{\nu x'}\,d^3x'$

$= \nabla^2 A_\mu + 4\pi j_\mu.$ (94)
(93) and (94) lead to $\Box A_\mu = 4\pi j_\mu$, (95)


which agrees with the Maxwell theory and shows that the interaction 
(90) gives correctly the action of the electrons and positrons on the 
field. 
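
The elimination of $B_\mu$ between (93) and (94) may be set out explicitly. The sketch below assumes only that $\Box$ stands for $\partial^2/\partial x_0^2 - \nabla^2$, as in the theory of the field without charges.

```latex
\begin{align*}
\frac{\partial^2 A_\mu}{\partial x_0^2}
  &= \frac{\partial B_\mu}{\partial x_0}
   = \nabla^2 A_\mu + 4\pi j_\mu ,\\
\Box A_\mu
  &\equiv \frac{\partial^2 A_\mu}{\partial x_0^2} - \nabla^2 A_\mu
   = 4\pi j_\mu .
\end{align*}
```

The first line uses (93) to replace $\partial A_\mu/\partial x_0$ by $B_\mu$ and then (94) for the time derivative of $B_\mu$.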

To complete the theory we must bring in the supplementary con- 
ditions (54). We must verify that they are in agreement with the 
equations of motion. The method used in § 77, which consisted in 
showing that the supplementary conditions at different times in the 
Heisenberg picture are consistent with one another, is no longer
applicable, because the quantum conditions connecting dynamical 
variables at different times get altered by the interaction in a way 
that is too complicated to be worked out. So we shall obtain all the 
supplementary conditions affecting the dynamical variables at one 
instant of time and check whether they are consistent. 

We have again equations (56). A further differentiation with respect
to $x_0$ gives

$\Box\,(\partial A_\mu/\partial x_\mu) \approx 0.$ (96)
Now the equation of motion for $\psi$, namely (92), leads, as in § 68, to


$\partial(\bar\psi\alpha_\mu\psi)/\partial x_\mu = 0.$
This is the same as $\partial j_\mu/\partial x_\mu = 0$, (97)


because the difference between $-e\bar\psi\alpha_\mu\psi$ and $j_\mu$ is constant in time, even
though it is infinite. From (95) we now see that (96) holds as a strong 
equation. Thus equations (56) are the only independent supplemen- 
tary conditions affecting the dynamical variables at one instant of 
time. The first of them gives (57), as before, and the second now gives, 
with the help of (95) for $\mu = 0$,


$(A_0{}^r + B_r)^r + 4\pi j_0 \approx 0.$ (98)
This may be written

$(A_0 + U)^{rr} + 4\pi j_0 \approx 0$ (99)
or, from (39), $\mathrm{div}\,\mathcal{E} - 4\pi j_0 \approx 0$, (100)


and is just one of the Maxwell equations. 
One can see without detailed calculation that, for any two points
$\mathbf{x}$ and $\mathbf{x}'$ at the same time,

$[j_{0x}, j_{0x'}] = 0$,

since, from the form of (70), the P.B. must be a multiple of $\delta(\mathbf{x}-\mathbf{x}')$
and cannot contain derivatives of $\delta(\mathbf{x}-\mathbf{x}')$, while also it has to be
antisymmetrical between $\mathbf{x}$ and $\mathbf{x}'$. Thus the extra terms $4\pi j_{0x}$ in
equations (98) for various values of $\mathbf{x}$, as compared with the corresponding
equations (58), commute with one another as well as with all the 
other dynamical variables occurring in (58) and (57). It follows that 
these extra terms will not disturb the consistency of (58) and (57), 
and hence (98) and (57) are consistent. 
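
The argument for the vanishing of this P.B. can be put in symbols. Suppose, as the form of (70) requires, that the P.B. is some coefficient times the δ function, with no derivative terms. The name $c_x$ for the coefficient is introduced here purely for illustration.

```latex
\begin{align*}
[j_{0x}, j_{0x'}] &= c_x\,\delta(\mathbf{x}-\mathbf{x}'),\\
[j_{0x'}, j_{0x}] &= c_{x'}\,\delta(\mathbf{x}'-\mathbf{x}) = c_x\,\delta(\mathbf{x}-\mathbf{x}'),\\
[j_{0x'}, j_{0x}] &= -[j_{0x}, j_{0x'}] = -c_x\,\delta(\mathbf{x}-\mathbf{x}').
\end{align*}
```

The second line comes from interchanging the labels in the first, and the third from the antisymmetry of the P.B. The last two lines agree only if $c_x\,\delta(\mathbf{x}-\mathbf{x}')$ vanishes. A term containing a first derivative of $\delta(\mathbf{x}-\mathbf{x}')$ would itself change sign under the interchange and so could survive, which is why its absence matters.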

Our method of introducing interaction into the theory was not 
relativistic, since the interaction energy (90) involves the dynamical 
variables at an instant of time in some Lorentz frame. It therefore 
becomes questionable whether the theory with interaction is a rela- 
tivistic one. Our field equations, namely (92) and (95), are evidently 
relativistic and so are the supplementary conditions (54). It remains 
uncertain whether the quantum conditions are Lorentz invariant. 

We know the quantum conditions connecting all our dynamical 
variables $A_{\mu x}$, $B_{\mu x}$, $\psi_{ax}$, $\bar\psi_{ax}$ at a given time $x_0$. We cannot, as mentioned
above, work out the general quantum conditions connecting
dynamical variables at any two points in space-time, because the 
interaction makes it too complicated. We shall therefore make an 
infinitesimal Lorentz transformation and work out the quantum con- 
ditions at a given time in the new frame of reference. If we can estab- 
lish that the quantum conditions are invariant under infinitesimal 
Lorentz transformations, their invariance under finite Lorentz trans- 
formations will follow. 

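Let $x_0^*$ be the time coordinate in the new frame of reference. It is
connected with the original coordinates by

$x_0^* = x_0 + \epsilon v_r x_r$, (101)
where $\epsilon$ is an infinitesimal number and $v_r$ is a three-dimensional vector,
$\epsilon v_r$ being the relative velocity of the two frames. We shall neglect
terms of order $\epsilon^2$.

A field quantity $\kappa$ at the place $\mathbf{x}$ at the time $x_0^*$ in the new frame
has the value

$\kappa(\mathbf{x}, x_0^*) = \kappa(\mathbf{x}, x_0) + (x_0^* - x_0)\,\partial\kappa_x/\partial x_0 = \kappa(\mathbf{x}, x_0) + \epsilon v_r x_r[\kappa_x, H].$ (102)
Its P.B. with another such field quantity $\lambda(\mathbf{x}', x_0^{*\prime})$ is
$[\kappa(\mathbf{x}, x_0^*), \lambda(\mathbf{x}', x_0^{*\prime})] = [\kappa(\mathbf{x}, x_0) + \epsilon v_r x_r[\kappa_x, H],\ \lambda(\mathbf{x}', x_0) + \epsilon v_r x'_r[\lambda_{x'}, H]]$
$= [\kappa(\mathbf{x}, x_0), \lambda(\mathbf{x}', x_0)] + \epsilon v_r x'_r[\kappa_x, [\lambda_{x'}, H]] + \epsilon v_r x_r[[\kappa_x, H], \lambda_{x'}]$
$= [\kappa(\mathbf{x}, x_0), \lambda(\mathbf{x}', x_0)] + \epsilon v_r(x'_r - x_r)[\kappa_x, [\lambda_{x'}, H]] + \epsilon v_r x_r[[\kappa_x, \lambda_{x'}], H].$ (103)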
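If $\kappa$ and $\lambda$ are $\psi$ or $\bar\psi$ variables, we should be interested in their
anticommutator instead of their P.B. Using the notation (69) for the
anticommutator, we have

$[\kappa(\mathbf{x}, x_0^*), \lambda(\mathbf{x}', x_0^{*\prime})]_+$
$= [\kappa(\mathbf{x}, x_0), \lambda(\mathbf{x}', x_0)]_+ + \epsilon v_r x'_r[\kappa_x, [\lambda_{x'}, H]]_+ + \epsilon v_r x_r[[\kappa_x, H], \lambda_{x'}]_+$
$= [\kappa(\mathbf{x}, x_0), \lambda(\mathbf{x}', x_0)]_+ + \epsilon v_r(x'_r - x_r)[\kappa_x, [\lambda_{x'}, H]]_+ + \epsilon v_r x_r[[\kappa_x, \lambda_{x'}]_+, H].$ (104)
With $\kappa$ and $\lambda$ any two of the basic variables $A_\mu$, $B_\mu$, $\psi_a$, $\bar\psi_a$, the P.B.
$[\kappa_x, \lambda_{x'}]$ or anticommutator $[\kappa_x, \lambda_{x'}]_+$, as the case may be, is a number,
and so the last term in (103) or (104) vanishes. We are left with
$[\kappa(\mathbf{x}, x_0^*), \lambda(\mathbf{x}', x_0^{*\prime})]_\pm = [\kappa(\mathbf{x}, x_0), \lambda(\mathbf{x}', x_0)]_\pm$
$\qquad + \epsilon v_r(x'_r - x_r)[\kappa_x, [\lambda_{x'}, H_F + H_P]]_\pm + \epsilon v_r(x'_r - x_r)[\kappa_x, [\lambda_{x'}, H_Q]]_\pm,$ (105)
where $[\kappa, \lambda]_\pm$ denotes the P.B. or the anticommutator, as the case may
be. From the form (90) for $H_Q$ we see that $[\lambda_{x'}, H_Q]$ can involve only
the dynamical variables $A_{\mu x'}$, $\psi_{ax'}$, $\bar\psi_{ax'}$ and cannot involve any derivatives
of these variables. It follows that $[\kappa_x, [\lambda_{x'}, H_Q]]_\pm$, if it does not
vanish, will be a multiple of $\delta(\mathbf{x}-\mathbf{x}')$ and will not contain terms with
derivatives of $\delta(\mathbf{x}-\mathbf{x}')$. Hence the last term of (105) vanishes. We
can conclude that $[\kappa(\mathbf{x}, x_0^*), \lambda(\mathbf{x}', x_0^{*\prime})]_\pm$ has the same value as when
there is no interaction, and is thus Lorentz invariant from our earlier
work.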
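
The vanishing of the last term of (105) rests on a property of the δ function which it may help to display. In the sketch below $c$ is a label introduced here for the coefficient of the δ function; it may involve dynamical variables at the point $\mathbf{x}$.

```latex
\begin{align*}
(x'_r - x_r)\,c\,\delta(\mathbf{x}-\mathbf{x}') &= 0,\\
(x'_r - x_r)\,\frac{\partial}{\partial x_s}\,\delta(\mathbf{x}-\mathbf{x}') &= \delta_{rs}\,\delta(\mathbf{x}-\mathbf{x}').
\end{align*}
```

The second relation, obtained by a partial integration against an arbitrary continuous function, shows that a term with a derivative of the δ function would have left a finite contribution. The absence of derivatives from $H_Q$ is thus essential to the argument.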

A possible criticism of the above proof should be noted. At several
places we worked out expressions in powers of $\epsilon$ and neglected $\epsilon^2$.
Such a procedure cannot be valid for calculating $[\kappa(x), \lambda(x')]_\pm$, with $x$
and $x'$ two general points in space-time lying close together, so that
$x_0 - x_0'$ is of order $\epsilon$, because the result of the calculation should be
a function of the $(x_\mu - x'_\mu)$'s having a singularity when the 4-vector
$x - x'$ lies on the light-cone and such a function, of course, cannot be
expanded as a power series in the $(x_\mu - x'_\mu)$'s.


To validate the argument we should reformulate it so as to avoid
the use of the $\delta$ function. Instead of evaluating $[\kappa(\mathbf{x}, x_0^*), \lambda(\mathbf{x}', x_0^{*\prime})]_\pm$,
we should evaluate


$\left[\int a_x\,\kappa(\mathbf{x}, x_0^*)\,d^3x,\ \int b_{x'}\,\lambda(\mathbf{x}', x_0^{*\prime})\,d^3x'\right]_\pm$ (106)


where $a_x$ and $b_x$ are two arbitrary continuous functions of $x_1$, $x_2$, $x_3$.
Then the quantities that we need to expand in powers of $\epsilon$ all vary
continuously with a continuous change in the direction of the time-
axis, and the expansions are justifiable. The equations that we now
get are those of the previous argument multiplied by $a_x b_{x'}\,d^3x\,d^3x'$
and integrated. We are led to the same conclusion, that the P.B.
or anticommutator has the same value as when there is no interaction.
It will be seen that the reason why the interaction does not disturb 
the quantum conditions is because it is so simple, involving only the
basic dynamical variables and not their derivatives. The P.B.s and 
anticommutators have the same values as with no interaction pro- 
vided they refer to variables at two points in space-time that are at 
the same time with respect to some observer. This means the two 
points must be outside each other’s light-cones and may approach 
coincidence only along a path lying outside the light-cone. 


80. The physical variables 
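A ket |P> that represents a physical state must satisfy the supplementary
conditions
$(B_0 + A_r{}^r)|P\rangle = 0, \qquad (\mathrm{div}\,\mathcal{E} - 4\pi j_0)|P\rangle = 0.$ (107)
A dynamical variable is physical if, when multiplied into any ket
satisfying these conditions, it gives another ket satisfying these
conditions. This requires that it shall commute with the quantities
$B_0 + A_r{}^r, \qquad \mathrm{div}\,\mathcal{E} - 4\pi j_0.$ (108)
Let us see what simple dynamical variables have this property.

The transverse field variables $\mathcal{A}_r$, $\mathcal{B}_r$ evidently commute with the
quantities (108) and are physical. The variable $\psi_a$ commutes with
the first of the quantities (108) but not the second and is thus not
physical. We have


$i\hbar[\psi_{ax}, \bar\psi_{bx'}\psi_{bx'}] = (\psi_{ax}\bar\psi_{bx'} + \bar\psi_{bx'}\psi_{ax})\psi_{bx'}$
$= \delta_{ab}\,\delta(\mathbf{x}-\mathbf{x}')\,\psi_{bx'} = \psi_{ax}\,\delta(\mathbf{x}-\mathbf{x}').$
Thus $[\psi_{ax}, j_{0x'}] = (ie/\hbar)\,\psi_{ax}\,\delta(\mathbf{x}-\mathbf{x}').$ (109)
From (42)

$[e^{ieV_x/\hbar}, \mathrm{div}\,\mathcal{E}_{x'}] = (4\pi ie/\hbar)\,e^{ieV_x/\hbar}\,\delta(\mathbf{x}-\mathbf{x}').$
Hence
$[e^{ieV_x/\hbar}\psi_{ax},\ \mathrm{div}\,\mathcal{E}_{x'} - 4\pi j_{0x'}] = [e^{ieV_x/\hbar}, \mathrm{div}\,\mathcal{E}_{x'}]\,\psi_{ax} - 4\pi e^{ieV_x/\hbar}[\psi_{ax}, j_{0x'}]$

$= 0.$
Thus if we put $\psi^*_{ax} = e^{ieV_x/\hbar}\psi_{ax}$, (110)


$\psi^*_{ax}$ commutes with both expressions (108) and is physical. Similarly
$\bar\psi^*_{ax} = e^{-ieV_x/\hbar}\bar\psi_{ax}$
is physical. The variables $\mathcal{A}_r$, $\mathcal{B}_r$, $\psi^*_a$, $\bar\psi^*_a$ are the only independent
physical variables, apart from the quantities (108) themselves.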

We have

$j_0 = -\tfrac12 e(\bar\psi^*\psi^* - \psi^*\bar\psi^*), \qquad j_r = -e\bar\psi^*\alpha_r\psi^*.$ (111)

Thus the charge density and current are physical. Also it is easily
seen that $\mathcal{E}$ and $\mathcal{H}$ are physical, just as in the case when there are
no electrons and positrons present. All those variables are physical 
that are unaffected by the arbitrariness that exists in the electro- 
magnetic potentials in the Maxwell theory. 

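The operator $\psi_{ax}$ represents the creation of a positron or the
annihilation of an electron at the point $\mathbf{x}$. Let us see what is the
physical significance of the operator $\psi^*_{ax}$. From (44)


$[e^{ieV_x/\hbar}, \mathcal{E}_{rx'}] = (ie/\hbar)\,e^{ieV_x/\hbar}\,(x'_r - x_r)|\mathbf{x}'-\mathbf{x}|^{-3}$,
and hence $[\psi^*_{ax}, \mathcal{E}_{rx'}] = (ie/\hbar)\,\psi^*_{ax}\,(x'_r - x_r)|\mathbf{x}'-\mathbf{x}|^{-3}$
or $\mathcal{E}_{rx'}\psi^*_{ax} = \psi^*_{ax}\{\mathcal{E}_{rx'} + e(x'_r - x_r)|\mathbf{x}'-\mathbf{x}|^{-3}\}.$ (112)
Take a state |P> for which $\mathcal{E}_r$ at a certain point $\mathbf{x}'$ certainly has the
numerical value $c_r$, so that


$\mathcal{E}_{rx'}|P\rangle = c_r|P\rangle.$
Then from (112)
$\mathcal{E}_{rx'}\psi^*_{ax}|P\rangle = \{c_r + e(x'_r - x_r)|\mathbf{x}'-\mathbf{x}|^{-3}\}\psi^*_{ax}|P\rangle$,
so for the state $\psi^*_{ax}|P\rangle$, $\mathcal{E}_r$ at the point $\mathbf{x}'$ certainly has the value
$c_r + e(x'_r - x_r)|\mathbf{x}'-\mathbf{x}|^{-3}$.

This means that the operator $\psi^*_{ax}$, besides creating a positron or
annihilating an electron at the point $\mathbf{x}$, increases the electric field at the
point $\mathbf{x}'$ by $e(x'_r - x_r)|\mathbf{x}'-\mathbf{x}|^{-3}$, which is just the classical Coulomb
field at $\mathbf{x}'$ of a positron with charge $e$ at the point $\mathbf{x}$. Thus the operator
$\psi^*_{ax}$ creates a positron at the point $\mathbf{x}$ together with its Coulomb field,
or else annihilates an electron at $\mathbf{x}$ together with its Coulomb field.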
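
One may check that this Coulomb field is just what is needed to preserve the second of the conditions (107). The sketch below uses (109), the fact that $e^{ieV_x/\hbar}$ commutes with $j_{0x'}$, and the standard result for the divergence of an inverse-square field. It is a verification under those assumptions and is not part of the original derivation.

```latex
\begin{align*}
j_{0x'}\,\psi^*_{ax} &= \psi^*_{ax}\{j_{0x'} + e\,\delta(\mathbf{x}-\mathbf{x}')\},\\
\frac{\partial}{\partial x'_r}\Big\{e\,(x'_r - x_r)\,|\mathbf{x}'-\mathbf{x}|^{-3}\Big\}
  &= 4\pi e\,\delta(\mathbf{x}'-\mathbf{x}),\\
(\mathrm{div}\,\mathcal{E}_{x'} - 4\pi j_{0x'})\,\psi^*_{ax}
  &= \psi^*_{ax}\,(\mathrm{div}\,\mathcal{E}_{x'} - 4\pi j_{0x'}).
\end{align*}
```

The increase $4\pi e\,\delta(\mathbf{x}'-\mathbf{x})$ in $\mathrm{div}\,\mathcal{E}$ is balanced by the increase $e\,\delta(\mathbf{x}-\mathbf{x}')$ in the charge density, so that $\psi^*_{ax}$ applied to a physical ket gives another physical ket.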

For electrons and positrons interacting with the electromagnetic 
field it is the variables $\psi^*$, $\bar\psi^*$, rather than the variables $\psi$, $\bar\psi$, that
correspond to the physical processes of creation and annihilation of
electrons and positrons, since these processes must always be
accompanied by the appropriate Coulomb change in the electric field around
the point where the particle is created or annihilated. It is easily seen
that the variables $\psi^*_{ax}$, $\bar\psi^*_{ax}$ satisfy the same anticommutation relations
(70) as the unstarred variables. When we pass to the momentum
representation the important quantities will be, not the unphysical
variables $\psi_p$ defined by (67), but the physical variables $\psi^*_p$ defined by


$\psi^*_x = h^{-3/2}\int e^{i(\mathbf{x}\mathbf{p})/\hbar}\,\psi^*_p\,d^3p, \qquad \psi^*_p = h^{-3/2}\int e^{-i(\mathbf{x}\mathbf{p})/\hbar}\,\psi^*_x\,d^3x.$ (113)
We must now replace (68) by


$\xi^*_p = \tfrac12\left\{1 + \dfrac{\alpha_r p_r + \alpha_m m}{(\mathbf{p}^2 + m^2)^{1/2}}\right\}\psi^*_p, \qquad \zeta^*_p = \tfrac12\left\{1 - \dfrac{\alpha_r p_r + \alpha_m m}{(\mathbf{p}^2 + m^2)^{1/2}}\right\}\psi^*_p,$

and take $\xi^*_p$ to represent the annihilation of an electron of momentum
$\mathbf{p}$, $\bar\xi^*_p$ the creation of an electron of momentum $\mathbf{p}$, $\zeta^*_p$ the creation of
a positron of momentum $-\mathbf{p}$ and $\bar\zeta^*_p$ the annihilation of a positron
of momentum $-\mathbf{p}$. The variables $\psi^*_p$, $\bar\psi^*_p$, $\xi^*_p$, $\bar\xi^*_p$, $\zeta^*_p$, $\bar\zeta^*_p$ will all satisfy
the same anticommutation relations as the corresponding unstarred
variables.

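We can express the Hamiltonian entirely in terms of physical
variables. We have

$\psi^{*r} = e^{ieV/\hbar}\{\psi^r + (ie/\hbar)\,V^r\psi\}.$

Thus
$H_P + H_Q = \int\{\bar\psi\alpha_r[-i\hbar\psi^r - e(\mathcal{A}^r - V^r)\psi] + \bar\psi\alpha_m m\psi + A^0 j_0\}\,d^3x$
$= \int\{\bar\psi^*\alpha_r(-i\hbar\psi^{*r} - e\mathcal{A}^r\psi^*) + \bar\psi^*\alpha_m m\psi^* + A^0 j_0\}\,d^3x.$
The last term in the integrand here should be combined with $H_F$.
From (49) and (57)

$H_{FL} \approx -(8\pi)^{-1}\int(U - A_0)(U + A_0)^{rr}\,d^3x$
$\approx \tfrac12\int(U - A_0)\,j_0\,d^3x$
with the help of (99). Thus
$H_{FL} + \int A^0 j_0\,d^3x \approx \tfrac12\int(U + A_0)\,j_0\,d^3x.$
Integrating (99) with the help of formula (72) of § 38, we get

$A_{0x} + U_x \approx \int\dfrac{j_{0x'}}{|\mathbf{x}-\mathbf{x}'|}\,d^3x'$

and hence
$H_{FL} + \int A^0 j_0\,d^3x \approx \tfrac12\iint\dfrac{j_{0x}\,j_{0x'}}{|\mathbf{x}-\mathbf{x}'|}\,d^3x\,d^3x'.$
Thus we get $H \approx H^*$
with

$H^* = H_{FT} + \int\{\bar\psi^*\alpha_r(-i\hbar\psi^{*r} - e\mathcal{A}^r\psi^*) + \bar\psi^*\alpha_m m\psi^*\}\,d^3x$
$\qquad + \tfrac12\iint\dfrac{j_{0x}\,j_{0x'}}{|\mathbf{x}-\mathbf{x}'|}\,d^3x\,d^3x'.$ (114)

We may use $H^*$ instead of $H$ as our Hamiltonian. It leads to the
same Schrödinger equation for a physical ket, since if |P> is physical
$H^*|P\rangle = H|P\rangle.$
Also it leads to the same Heisenberg equations of motion for physical
variables, since if $\xi$ is a physical variable

$[\xi, H^*] = [\xi, H].$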
Thus H* and H are equivalent Hamiltonians for the physical quanti- 
ties, and the others do not matter. 

H* involves only physical variables. The longitudinal field variables 
do not appear in it. Instead of them we have the last term of (114), 
which is just the Coulomb interaction energy of any charges that are 
present. The appearance of such a term in a relativistic theory is 
rather strange, as it is an energy associated with the instantaneous 
propagation of forces. It appears as a result of our having transformed 
the theory a long way from the Heisenberg form in which the relati- 
vistic invariance of the theory is manifest. 
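
To see that the last term of (114) has the form of the ordinary Coulomb energy, one may evaluate it for a classical distribution of two point charges. This is an illustration only: the charges $e_1$, $e_2$ and the positions $\mathbf{x}_1$, $\mathbf{x}_2$ are hypothetical, and $j_0$ is here treated as a number and not as an operator.

```latex
\begin{align*}
j_{0x} &= e_1\,\delta(\mathbf{x}-\mathbf{x}_1) + e_2\,\delta(\mathbf{x}-\mathbf{x}_2),\\
\tfrac12\iint\frac{j_{0x}\,j_{0x'}}{|\mathbf{x}-\mathbf{x}'|}\,d^3x\,d^3x'
  &= \frac{e_1 e_2}{|\mathbf{x}_1-\mathbf{x}_2|} + \text{(self-energy terms)}.
\end{align*}
```

The factor ½ is compensated by the two equal cross terms. The self-energy terms, in which both δ functions refer to the same charge, are infinite for point charges and do not depend on the distance between the charges.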

We could set up a representation by taking as standard ket the
product of the standard ket $|0_F\rangle$ for the electromagnetic field alone,
given by (61) and (62), with the standard ket $|0_P\rangle$ for the electrons
and positrons alone, given by (74). This representation would not be
a convenient one, however, because its standard ket does not satisfy 
the second of the supplementary conditions (107). 

We get a more convenient representation if we take another standard
ket |Q> satisfying


$(B_0 + A_r{}^r)|Q\rangle = 0, \qquad (\mathrm{div}\,\mathcal{E} - 4\pi j_0)|Q\rangle = 0,$ (115)
$\mathcal{A}_{rk}|Q\rangle = 0, \qquad \xi^*_{ap}|Q\rangle = 0, \qquad \bar\zeta^*_{ap}|Q\rangle = 0.$ (116)
These conditions are consistent, because the operators on |Q> in
them all commute or anticommute with each other, and there are
enough of them to fix |Q> completely, apart from a numerical factor,
because there are as many of them as of the conditions for $|0_F\rangle|0_P\rangle$.
The conditions (115) show that |Q> satisfies the supplementary con- 
ditions and so represents a physical state. The conditions (116) show 
that |Q> represents a state for which there are no photons, electrons, 
or positrons present. 

Any ket |P> that satisfies the supplementary conditions (107) and
so represents a physical state can be expressed as some physical
variable multiplied into |Q>. The only independent physical variables
that give non-vanishing results when applied to |Q> are $\bar{\mathcal{A}}_{rk}$,
$\bar\xi^*_{ap}$, $\zeta^*_{ap}$, so

$|P\rangle = \Psi(\bar{\mathcal{A}}_{rk}, \bar\xi^*_{ap}, \zeta^*_{ap})|Q\rangle.$ (117)
Thus |P> is represented by a wave functional $\Psi$ involving the variables
$\bar{\mathcal{A}}_{rk}$, $\bar\xi^*_{ap}$, $\zeta^*_{ap}$. It is a power series in these variables, the various terms
in it corresponding to the existence of various numbers of photons,
electrons, and positrons, with the Coulomb fields around the electrons
and positrons.

In using the representation (117) together with the Hamiltonian H*, 
we have a form of the theory in which we can ignore the conditions 
(115), as they have no effect on the kets (117). We must retain the 
conditions (116). The longitudinal variables then no longer appear 
in the theory. 


81. Interpretation 

The foregoing work establishes the basic equations of quantum 
electrodynamics. There are two forms of the theory, involving the 
Hamiltonians H and H* respectively. We must now consider the 
interpretation and application of the theory. We shall take the H* 
form for definiteness. The argument would be essentially the same 
with the H form.

The ket |Q> represents a state for which there are no photons, 
electrons, or positrons present. One would be inclined to suppose this 
state to be the perfect vacuum, but it cannot be, because it is not 
stationary. For it to be stationary we should need to have 


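$H^*|Q\rangle = C|Q\rangle$

with $C$ a number. Now $H^*$ contains the terms

$-e\int\bar\psi^*\alpha_r\mathcal{A}^r\psi^*\,d^3x + \tfrac12\iint\dfrac{j_{0x}\,j_{0x'}}{|\mathbf{x}-\mathbf{x}'|}\,d^3x\,d^3x',$ (118)

which do not give numerical factors when applied to |Q> and which
therefore spoil the stationary character of |Q>.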
Let us call the state Q represented by |Q> the no-particle state at
a certain time. If we start with the no-particle state it does not remain
the no-particle state. Particles get created where none previously
existed, their energy coming from the interaction part of the Hamiltonian.


To study this spontaneous creation of particles, we take the ket
|Q> as initial ket in the Schrödinger picture and treat the terms (118)
as a perturbation giving rise to a probability of the state Q jumping
into another state, in accordance with the theory of § 44. The first of
them, resolved into its Fourier components, contains a part


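$-e(\alpha_r)_{ab}\iint\bar\xi^*_{a\,\mathbf{p}-\mathbf{k}\hbar}\,\bar{\mathcal{A}}^r_{\mathbf{k}}\,\zeta^*_{b\mathbf{p}}\,d^3k\,d^3p,$ (119)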


which causes transitions in which a photon is emitted and simul- 
taneously an electron-positron pair is created. After a short time 
the transition probability is proportional to the squared length of the 
ket formed by multiplying (119) into the initial ket |Q>, which is 


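$e^2(\alpha_r)_{ab}(\alpha_s)_{cd}\ \times$
$\times\iiiint\langle Q|\bar\zeta^*_{a\mathbf{p}}\,\mathcal{A}^r_{\mathbf{k}}\,\xi^*_{b\,\mathbf{p}-\mathbf{k}\hbar}\,\bar\xi^*_{c\,\mathbf{p}'-\mathbf{k}'\hbar}\,\bar{\mathcal{A}}^s_{\mathbf{k}'}\,\zeta^*_{d\mathbf{p}'}|Q\rangle\,d^3k\,d^3p\,d^3k'\,d^3p'$
$= e^2(\alpha_r)_{ab}(\alpha_s)_{cd}\iiiint\langle Q|(\mathcal{A}^r_{\mathbf{k}}\bar{\mathcal{A}}^s_{\mathbf{k}'} - \bar{\mathcal{A}}^s_{\mathbf{k}'}\mathcal{A}^r_{\mathbf{k}})\ \times$
$\times\ [\xi^*_{b\,\mathbf{p}-\mathbf{k}\hbar}, \bar\xi^*_{c\,\mathbf{p}'-\mathbf{k}'\hbar}]_+\,[\bar\zeta^*_{a\mathbf{p}}, \zeta^*_{d\mathbf{p}'}]_+|Q\rangle\,d^3k\,d^3p\,d^3k'\,d^3p'.$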
Using the values of the P.B. and anticommutators given by (4), (16), 
(72), (73), we get an integrand which depends on the k, k’ variables 
according to the law |k|-15(k—k’) for large values of k and k’. This 
gives an integral that diverges, so the transition probability is infinite. 

The second term of (118), resolved into its Fourier components, 
contains terms like $\bar\xi^*\,\bar\xi^*\,\zeta^*\,\zeta^*$, which cause transitions in which
two electron-positron pairs are created simultaneously. One can 
calculate the transition probability as before, and one finds again 
that it is infinite. From these calculations one can conclude that the 
state Q is not even approximately stationary. 

A theory which gives rise to infinite transition probabilities of 
course cannot be correct. We can infer that there is something wrong 
with quantum electrodynamics. This result need not surprise us, 
because quantum electrodynamics does not provide a complete 
description of nature. We know from experiment that there exist 
other kinds of particles, which can get created when large amounts of 
energy are available. All that we can expect from a theory of quantum 
electrodynamics is that it shall be valid for processes in which there
is not enough energy available for these other particles to be created 
to an appreciable extent, say for energies up to a few hundred MeV. 
Thus the high-energy part of the interaction energy (118) is quite 
unreliable, and it is this high-energy part that is responsible for the 
infinities. 

It appears that we must modify the high-energy part of the inter- 
action. At present there does not exist any detailed theory of the other 
particles and so it is not possible to say how it ought to be modified. 
The best we can do is to cut it out from the theory altogether, and so 
remove the infinities. The precise form of the cut-off and the energy 
where it is applied will be left unspecified. Of course, the cut-off 
spoils the relativistic invariance of the theory. This is a blemish 
which cannot be avoided in our present state of ignorance of high- 
energy processes. 

Even with a cut-off the no-particle state Q is not approximately 
stationary. It therefore differs very much from the vacuum state. 
The vacuum state must contain many particles, which may be 
pictured as in a state of transient existence with violent fluctua- 
tions. 

Let us introduce the ket |V> to represent the vacuum state. It is 
the eigenket of H* belonging to the lowest eigenvalue. Here and sub- 
sequently H* denotes the expression (114) modified by the cut-off. 
One might try to calculate |V> as a perturbation of the ket |Q>, but
such a method would be of doubtful validity, because the difference
between |V> and |Q> is not small. No satisfactory way of calculating
|V> is known. In any case the result would depend strongly on the
cut-off, and since the cut-off is unspecified the result would not be a
definite one.

It follows that we must develop the theory without knowing |V>.
This is not a great hardship, because we are not mainly interested in 
the vacuum state. We are mainly interested in states which differ 
from the vacuum through having a few particles present in addition 
to those associated with the vacuum fluctuations, and we want to 
know how these extra particles behave. For this purpose we focus our 
attention on an operator K representing the creation of the extra 
particles, so that the state we are interested in appears as K|V>.

We do not know how the ket |V> varies with the time in the Schrödinger
picture, since we do not know the lowest eigenvalue of H*. To
avoid this difficulty we work in the Heisenberg picture in which |V> is
constant. We then require K|V> to represent another state in the
Heisenberg picture and thus to be another constant ket. This leads to

$dK/dt = 0.$ (120)

Usually K will involve the time explicitly as well as Heisenberg
dynamical variables, so (120) gives
$i\hbar\,\partial K/\partial t + KH^* - H^*K = 0.$ (121)
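
The passage from (120) to (121) uses the general Heisenberg equation of motion for a variable that depends explicitly on the time, together with the definition of the quantum P.B. A sketch of the step:

```latex
\begin{align*}
\frac{dK}{dt} &= \frac{\partial K}{\partial t} + [K, H^*]
   = \frac{\partial K}{\partial t} + \frac{KH^* - H^*K}{i\hbar} = 0,\\
0 &= i\hbar\,\frac{\partial K}{\partial t} + KH^* - H^*K .
\end{align*}
```

The second line is the first multiplied by $i\hbar$.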

We now have each physical state determined by a solution K of 
(120) or (121). We obtained this result without knowing the vacuum 
ket |V>, and we can proceed to study K without knowing |V>. The 
only further information about K that we would have if we did know 
|V> would be that two K's, say $K_1$ and $K_2$, would correspond to the
same state if we had $(K_1 - K_2)|V\rangle = 0$. But we can get on without
this further information and count all different K’s satisfying (121) 
as corresponding to different states. 

We are thus led to a drastic alteration of one of the basic ideas of 
quantum mechanics, namely to represent a state by a linear operator and 
not a ket vector. This alteration is brought about by the complexities 
of applying quantum mechanics to a field and by our ignorance of 
high-energy processes. 

A trivial solution of (120) or (121) is K = 1. This evidently corre- 
sponds to the vacuum state. 

A general solution may be put in the form of an explicit function of
$t$ and of the dynamical variables at time $t$. Let us use the symbol $\eta_t$
to denote collectively the emission operators at time $t$. Thus $\eta_t$
equals one of the variables $\bar{\mathcal{A}}_{rk}$, $\bar\xi^*_{ap}$, $\zeta^*_{ap}$ at the time $t$ in the Heisenberg
picture. The absorption operators are then $\bar\eta_t$. A solution of (121) then
is $K = f(t, \eta_t, \bar\eta_t).$ (122)
We require some physical interpretation for the state represented by
this K, as the usual physical interpretation of quantum mechanics,
requiring a state to be represented by a ket, is no longer applicable.
We shall need to make some new assumptions.

Keeping to the Heisenberg picture, we introduce at each time $t$ the
ket $|Q_t\rangle$ satisfying the conditions (116) with respect to the Heisenberg
dynamical variables at time $t$. These conditions may now be written

$\bar\eta_t|Q_t\rangle = 0.$
The ket $|Q_t\rangle$ corresponds to no particles existing at the time $t$ and it
provides a reference ket for the discussion of general states at time $t$.
For any state fixed by a solution K of (121) we form $K|Q_t\rangle$ and
assume that this ket determines what can be observed at the time t and is
to be interpreted according to the standard rules. We obtain K in the
form (122) and then arrange it so that in each term all the absorption
operators $\bar\eta_t$ are to the right of all the emission operators $\eta_t$. It is then
said to be in the normal order. Any term in K containing an absorption
operator then contributes nothing to $K|Q_t\rangle$. The surviving terms in
$K|Q_t\rangle$ will contain only emission operators, like (117). Each surviving
term is associated with certain particles in particular states, and the
square of the modulus of its coefficient (with the appropriate factors
n! when there is more than one boson in the same state) is assumed to
be, after normalization, the probability of these particles existing in
these particular states at the time t.
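
As an illustration of the rule, consider a single boson mode with emission operator $\eta$ and absorption operator $\bar\eta$. The example is hypothetical: the mode is taken to be discrete, with $\bar\eta\eta - \eta\bar\eta = 1$ and with $|Q_t\rangle$ normalized, and the numerical coefficients $a$, $b$, $c$ are introduced only for the purpose of the sketch.

```latex
\begin{align*}
K &= a + b\,\eta + c\,\bar\eta\eta = (a + c) + b\,\eta + c\,\eta\bar\eta,\\
K|Q_t\rangle &= (a + c)|Q_t\rangle + b\,\eta|Q_t\rangle,\\
P_0 &= \frac{|a+c|^2}{|a+c|^2 + |b|^2}, \qquad
P_1 = \frac{|b|^2}{|a+c|^2 + |b|^2}.
\end{align*}
```

Here $P_0$ and $P_1$ are the probabilities of no particle and of one particle being present at the time $t$. The term $c\,\eta\bar\eta$, which has an absorption operator on the right when K is in the normal order, contributes nothing to $K|Q_t\rangle$, although the rearrangement that produced it has altered the numerical term from $a$ to $a + c$.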

We now have a general method of physical interpretation which is 
rather similar to the usual one, but there are important differences. 
A term in K with an absorption operator on the right will not contribute
to $K|Q_t\rangle$ and so will not contribute anything observable at
time t. We may call it a latent term at the time t. Such a term cannot
be discarded as non-existent, because it will contribute observable 
effects at other times. These latent terms are a new feature of the 
theory and are to be understood as an incompleteness in the descrip- 
tion of a state in terms merely of the particles which can be observed 
to be present at a certain time. 

As a consequence of the occurrence of latent terms, if $K|Q_t\rangle$ is
normalized at one time, it will usually not be normalized at other times.
We thus have to carry out a separate normalization for each time in
order to derive the probabilities.


82. Applications 

There are two important applications of the foregoing theory in 
which effects are calculated that cannot be obtained from a more 
primitive theory. These applications are concerned with a single 
electron in a static electric or magnetic field. As a consequence of the 
interaction of the electron with electromagnetic waves, the energy 
levels are shifted somewhat from their values given by the elementary 
theory. The important cases are: 

(i) An electron in the Coulomb field of a proton. The theory here 
leads to a shift in the energy levels of the hydrogen atom. It is 
named the Lamb shift, after its discoverer. 

(ii) An electron in a uniform magnetic field. The extra energy is
here interpreted as arising from an extra magnetic moment of
the electron, called the anomalous magnetic moment.

To take a static field into account one merely has to introduce
potentials to describe it and add them on to the potentials in the
Hamiltonian. The potentials of the static field are functions of
$x_1$, $x_2$, $x_3$ only, and are numbers for each $x_1$, $x_2$, $x_3$, not dynamical
variables, so their introduction does not increase the number of degrees 
of freedom. 

The calculations of the Lamb shift and anomalous magnetic moment 
are rather complicated. They are given in detail, working from the 
Hamiltonian H, in the author’s book Lectures on Quantum Field 
Theory (Academic Press, 1966). The results are in good agreement with 
experiment and provide a confirmation of the theory. 

These calculations were made in terms of the Heisenberg picture 
throughout. One may tackle quantum electrodynamics on the 
Schrödinger picture, looking for a solution of the Schrödinger equation
by taking the no-particle ket, or a ket corresponding to just a few 
particles present, as the initial ket of a perturbation procedure and 
applying the standard perturbation technique. One finds that the 
later terms are large and depend strongly on the cut-off, or are 
infinite if there is no cut-off. The perturbation procedure is not 
logically valid under these conditions. 

Nevertheless people have developed this method a long way and 
have devised working rules for discarding infinities (in a theory 
without cut-off) in a systematic manner, so that finite residual effects 
remain. The procedure is described in many books, e.g. Heitler’s 
Quantum Theory of Radiation (Clarendon Press, 1954). The original 
calculations of the Lamb shift and anomalous magnetic moment were 
carried out on these lines, long before the corresponding calculations 
in the Heisenberg picture. The results are the same by both methods. 

I do not see how these calculations based on the Schrödinger
picture, supplemented by some working rules, can be presented as a
logical development of the standard principles of quantum mechanics.
The Schrödinger picture is unsuited for dealing with quantum electro-
dynamics, because the vacuum fluctuations play such a dominant role 
in it. These fluctuations present great mathematical difficulties, and 
also they are not of physical importance. They get bypassed when one 
uses the Heisenberg picture, and one is then able to concentrate on 
quantities that are of physical importance. 
Quantum mechanics may be defined as the application of equations
of motion to atomic particles. It was first shown that atomic particles 
are subject to equations of motion when Bohr set up his theory of the 
hydrogen atom. The big development was made when Heisenberg 
discovered the need for non-commutative multiplication. The domain 
of applicability of the theory is mainly the treatment of electrons and 
other charged particles interacting with the electromagnetic field— 
a domain which includes most of low-energy physics and chemistry. 

Now there are other kinds of interactions, which are revealed in 
high-energy physics and are important for the description of atomic 
nuclei. These interactions are not at present sufficiently well under- 
stood to be incorporated into a system of equations of motion. 
Theories of them have been set up and much developed and useful 
results obtained from them. But in the absence of equations of 
motion these theories cannot be presented as a logical development 
of the principles set up in this book. We are effectively in the pre-Bohr
era with regard to these other interactions. 

It is to be hoped that with increasing knowledge a way will eventually
be found for adapting the high-energy theories into a scheme
based on equations of motion, and so unifying them with those of 
low-energy physics. 
contravariant, 254.

Coulomb interaction energy, 305. 

covariant, 254. 

cut-off, 308.

de Broglie waves, 120. 

degenerate system, 171. 

dependent, 16, 17. 

diagonal element, 68. 


diagonal in a representation, 74, 

— matrix, 68, 69, 70. 

— with respect to an observable, 77. 
displacement operator, 102. 

dual vector, 18. 

δ_rs, 62.

δ function, 58.

A function, 280. 


e, 157. 

eigen, 30. 

eigenfunction, 117. 

Einstein’s photo-electric law, 7. 
element of a matrix, 68. 

even permutation, 208. 
exclusion principle, 211. 
exclusive set of states, 215. 


Fermi statistics, 210. 

fermion, 210. 

Fock’s representation, 139, 228. 
functional, 291. 


Gibbs ensemble, 131. 
Green’s theorem, 191. 
group velocity, 120. 


h, ħ, 87.

half-width of absorption line, 204, 
Hamiltonian, 113, 114. 
Hamilton-Jacobi equation, 122. 
Heisenberg dynamical variable, 113. 
— picture, 112. 

— representation, 117. 

Hermitian matrix, 68, 69. 

Hilbert space, 40. 

holes, 252. 


identical permutation, 212. 
improper function, 58. 
independent, 16, 17. 
intermediate state, 175. 


ket, 16. 
Kramers-Heisenberg dispersion for- 
mula, 248. 


Lagrangian, 128. 
Landé’s formula, 184. 
length of a bra or ket, 22. 
linear operator, 23.
longitudinal energy, 286, 290. 
— field, 277. 


magnetic anomaly of the spin, 166. 
— moment of electron, 165, 266. 
magnitude of angular momentum, 146. 
matrix, 68, 69. 

Maxwell’s equations, 290, 299. 
momentum representation, 96. 
multiplet, 182, 223. 


non-degenerate system, 171. 
no-particle state, 308. 
normal order, 310. 

normal state, 139. 
normalization, 22. 


observable, 37, 288. 

— having a value, 46. 

— having an average value, 46.
odd permutation, 208. 

orbital variable, 220. 

— angular momentum, 142, 148. 
orthogonal bras, kets, 21. 

— representation, 54. 

— states, 22, 35. 

orthogonality theorem, 32. 
oscillator, 136, 227. 


P.B., 85. 

Pauli’s exclusion principle, 211. 
permutation, 208, 211. 

phase factor, 22. 

— space, 131. 

physical variable, 288. 

Planck’s constant, 87. 

Poisson bracket, 85. 

positive square root, 45. 
positron, 274. 

probability amplitude, 73. 

— coefficient, 180. 

— current, 260. 

— density, 258. 

— of observable having a value, 47.
proper-energy, 179. 


quantum condition, 84. 


radial momentum, 153. 

real linear operator, 27. 
reciprocal of an observable, 44. 
— permutation, 212. 
reciprocity theorem, 76. 
relative probability amplitude, 73.
representation, 53. 
representative, 53, 67. 

rotation operator, 142. 


scatterer, 185.

Schrödinger dynamical variable, 113.
— picture, 111. 

Schrödinger’s representation, 93.
— wave equation, 111. 

second quantization, 230, 251. 
selection rule, 159. 

self-adjoint, 27. 

similar permutations, 212. 
simultaneous eigenstate, 49. 
Sommerfeld’s formula, 272. 
spherical harmonic, 154. 

— symmetry, 143. 

spin angular momentum, 142, 267. 
— of electron, 149, 266. 

square root of an observable, 44. 
standard ket, 79. 

state, 11.

— of absorption, 187. 

— of motion, 12. 

— of polarization, 5. 

stationary state, 116. 

stimulated emission, 177, 238. 
strong equation, 289. 
superposition of states, 12. 
supplementary condition, 287.
symmetrical ket, state, 208. 

— representation, 208. 
symmetrizing operator, 225.


time-dependent wave function, 111. 
transformation function, 75. 
translational state, 7. 

transverse energy, 286, 291.

— field, 277. 


uncertainty principle, 98.
unit matrix, 68, 69. 
unitary, 104. 


wave equation, 111.

— function, 80. 
— mechanics, 14.

— packet, 97, 121. 

weak equation, 289.
weight function, 66. 
well-ordered function, 130. 